arXiv:1502.02725v2 [cs.DC] 4 Mar 2016


Why Transactional Memory Should Not Be Obstruction-Free 


Petr Kuznetsov 1 Srivatsan Ravi 2 
1 Telecom ParisTech 
2 TU Berlin 


Abstract 

Transactional memory (TM) is an inherently optimistic abstraction: it allows concurrent 
processes to execute sequences of shared-data accesses (transactions) speculatively, with an
option of aborting them in the future. Early TM designs avoided using locks and relied on 
non-blocking synchronization to ensure obstruction-freedom: a transaction that encounters no 
step contention is not allowed to abort. However, it was later observed that obstruction- 
free TMs perform poorly and, as a result, state-of-the-art TM implementations are nowadays 
blocking, allowing aborts because of data conflicts rather than step contention.

In this paper, we explain this shift in the TM practice theoretically, via complexity bounds. 
We prove a few important lower bounds on obstruction-free TMs. Then we present a lock- 
based TM implementation that beats all of these lower bounds. In sum, our results exhibit a 
considerable complexity gap between non-blocking and blocking TM implementations. 


1 Introduction 


Transactional memory (TM) allows concurrent processes to organize sequences of operations on 
shared data items into atomic transactions. A transaction may commit, in which case its updates 
of data items “take effect”, or it may abort, in which case no data items are updated. A TM
implementation provides processes with algorithms for implementing transactional operations on
data items (such as read, write and tryCommit) by applying primitives on shared base objects.
Intuitively, the idea behind the TM abstraction is optimism: before a transaction commits, all its 
operations are speculative, and it is expected that, in the absence of concurrency, a transaction 
commits. 


It therefore appears natural that early TM implementations [13,20,25,29,30] adopted optimistic
concurrency control and guaranteed that a prematurely halted transaction cannot prevent other
transactions from committing. These implementations avoided using locks and relied on
non-blocking (sometimes also called lock-free) synchronization. Possibly the weakest
non-blocking progress condition is obstruction-freedom [19,21], stipulating that every transaction
running in the absence of step contention, i.e., not encountering steps of concurrent transactions,
must commit.

In 2005, Ennals [12] argued that obstruction-free TMs inherently yield poor performance,
because they require transactions to forcefully abort each other. Ennals further described a
lock-based TM implementation [11] that he claimed to outperform DSTM [20], the most referenced
obstruction-free TM implementation at the time. Inspired by [12], more recent TM implementations
like TL [8], TL2 [7] and NOrec [6] employ locking and showed that Ennals' claims about the
performance of lock-based TMs hold true on most workloads. The progress guarantee provided
by these TMs is typically progressiveness: a transaction may be aborted only if it encounters a
read-write or write-write conflict with a concurrent transaction [16].

There is a considerable amount of empirical evidence on the performance gap between non- 
blocking (obstruction-free) and blocking (progressive) TM implementations but, to the best of our 
knowledge, no analytical result explains it. Complexity lower and upper bounds presented in this 
paper provide such an explanation. 











Lower bounds for non-blocking TMs. Our first result focuses on two important TM properties:
weak disjoint-access-parallelism (weak DAP) and read invisibility. Weak DAP [5] is believed
to improve TM performance by ensuring that transactions concurrently contend on the same base
object (both access the base object and at least one updates it) only if their data sets are connected
in the conflict graph constructed on the data sets of concurrent transactions [5]. Many popular
obstruction-free TM implementations satisfy weak DAP [13,20,30], but not the stronger property
of strict DAP [14,17], which disallows any two transactions from contending on a base object
unless they access a common data item.

A TM implementation uses invisible reads if, informally, a reading transaction cannot cause
a concurrent transaction to abort (we give a more precise definition later in this paper), which is
believed to be important for (most commonly observed) read-dominated workloads. Interestingly,
lock-based TM implementations like TL [8] are weak DAP and use invisible reads. In contrast,
we establish that it is impossible to implement a strictly serializable (all committed transactions
appear to execute sequentially in some total order respecting the timing of non-overlapping
transactions) obstruction-free TM that provides both weak DAP and read invisibility. Indeed,
obstruction-free TMs like DSTM [20] and FSTM [13] satisfy weak DAP, but not read invisibility,
since read operations must write to the shared memory.

We then derive lower bounds on obstruction-free TM implementations with respect to the
number of stalls. The stall complexity captures the fact that the time a process might have
to spend before it applies a primitive on a base object can be proportional to the number of
processes that try to concurrently update the object [10]. Our second result shows that a single
read operation in an $n$-process strictly serializable obstruction-free TM implementation may incur
$\Omega(n)$ stalls.

Finally, we prove that any read-write (RW) DAP opaque (all transactions appear to execute
sequentially in some total order respecting the timing of non-overlapping transactions)
obstruction-free TM implementation has an execution in which a read-only transaction incurs
$\Omega(n)$ non-overlapping RAWs or AWARs. Intuitively, RAW (read-after-write) or AWAR
(atomic-write-after-read) patterns [3] capture the amount of “expensive synchronization”, i.e., the number of costly
conditional primitives or memory barriers 111 incurred by the implementation. The metric appears 
to be more practically relevant than simple step complexity, as it accounts for expensive
cache-coherence operations or conditional instructions. RW DAP, satisfied by most obstruction-free
implementations [13,20], requires that read-only transactions do not contend on the same base
object with transactions having disjoint write sets. It is stronger than weak DAP [5], but weaker
than strict DAP.

Figure 1: Complexity gap between blocking and non-blocking strictly serializable TM
implementations; n is the number of processes


An upper bound for blocking TMs. To exhibit a complexity gap between blocking and non- 
blocking TMs, we describe a progressive opaque TM implementation that beats the impossibility 
result and the lower bounds we established for obstruction-free TMs. 

Our implementation, denoted LP, (1) uses only read and write primitives on base objects and
ensures that every transactional operation terminates in a wait-free manner, (2) ensures strict
DAP, (3) has invisible reads, (4) performs $O(1)$ non-overlapping RAWs/AWARs per transaction,
and (5) incurs $O(1)$ memory stalls for read operations. In contrast, the following claims hold for
any implementation in the class of obstruction-free (OF) strictly serializable TMs: No OF TM can
be implemented (i) using only read and write primitives and provide wait-free termination [17],
or (ii) provide strict DAP [15]. Furthermore, (iii) no weak DAP OF TM has invisible reads
(Theorem 2) and (iv) no OF TM ensures a constant number of stalls incurred by a read operation
(Theorem 5). Finally, (v) no RW DAP opaque OF TM has constant RAW/AWAR complexity
(Section 4.3). In fact, (iv) and (v) exhibit a linear separation between blocking and non-blocking
TMs w.r.t. memory stall complexity and expensive synchronization, respectively.

Our results are summarized in Figure 1. Altogether, they exhibit a considerable complexity gap
between blocking and non-blocking TM implementations, justifying theoretically the shift in TM
practice we observed during the past decade.

Roadmap. Sections 2 and 3 define our model and the classes of TMs considered in this paper.
Section 4 contains lower bounds for obstruction-free TMs. Section 5 describes our lock-based
TM implementation LP. In Section 6, we discuss the related work and in Section 7, concluding
remarks. Some proofs are delegated to the optional appendix.


2 Model 


TM interface. Transactional memory (in short, TM) allows a set of data items (called t-objects)
to be accessed via atomic transactions. Every transaction $T_k$ has a unique identifier $k$. We make
no assumptions on the size of a t-object, i.e., the cardinality of the set $V$ of possible values a
t-object can store. A transaction $T_k$ may contain the following t-operations, each being a matching
pair of an invocation and a response: $\mathit{read}_k(X)$ returns a value in $V$ or a special value
$A_k \notin V$ (abort); $\mathit{write}_k(X, v)$, for a value $v \in V$, returns $ok$ or $A_k$;
$\mathit{tryC}_k$ returns $C_k \notin V$ (commit) or $A_k$.


TM implementations. We consider an asynchronous shared-memory system in which a set of $n$
processes communicate by applying primitives on shared base objects. We assume that processes
issue transactions sequentially, i.e., a process starts a new transaction only after the previous
transaction has committed or aborted. A TM implementation provides processes with algorithms
for implementing $\mathit{read}_k$, $\mathit{write}_k$ and $\mathit{tryC}_k()$ of a transaction $T_k$
by applying primitives from a set of shared base objects, each of which is assigned an initial value.
We assume that these primitives are deterministic. A primitive is a generic read-modify-write (RMW)
procedure applied to a base object [10,18]. It is characterized by a pair of functions
$\langle g, h \rangle$: given the current state of the base object, $g$ is an update function that
computes its state after the primitive is applied, while $h$ is a response function that specifies the
outcome of the primitive returned to the process. A RMW primitive is trivial if it never changes the
value of the base object to which it is applied.
Otherwise, it is nontrivial.
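
As an illustration of the $\langle g, h \rangle$ view of primitives, the following minimal sketch models a base object and three familiar primitives. The names `BaseObject`, `read`, `write`, `compare_and_swap` and `is_trivial_on` are hypothetical and chosen for this example only; triviality is checked over a finite sample of states supplied by the caller, which is a simplification of the definition above.

```python
from dataclasses import dataclass
from typing import Any, Callable, Iterable, Tuple

# A primitive is a pair (g, h): g maps the current state to the next state,
# h maps the current state to the response returned to the process.
Primitive = Tuple[Callable[[Any], Any], Callable[[Any], Any]]


@dataclass
class BaseObject:
    state: Any

    def apply(self, primitive: Primitive) -> Any:
        g, h = primitive
        response = h(self.state)    # response computed from the current state
        self.state = g(self.state)  # state after the primitive is applied
        return response


def read() -> Primitive:
    # Trivial: g is the identity, so the state never changes.
    return (lambda s: s, lambda s: s)


def write(v: Any) -> Primitive:
    # Nontrivial in general: the state becomes v whatever it was before.
    return (lambda s: v, lambda s: "ok")


def compare_and_swap(expected: Any, new: Any) -> Primitive:
    # Conditional primitive: updates the state only if it equals `expected`.
    return (lambda s: new if s == expected else s, lambda s: s == expected)


def is_trivial_on(primitive: Primitive, states: Iterable[Any]) -> bool:
    # Assumption: triviality is tested on a finite sample of states only.
    g, _ = primitive
    return all(g(s) == s for s in states)
```

Under this encoding, a read is trivial on every sampled state, whereas a write or a successful compare-and-swap is not.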


Executions and configurations. An event of a transaction $T_k$ (sometimes we say step of $T_k$) is
an invocation or response of a t-operation performed by $T_k$ or a RMW primitive $\langle g, h \rangle$
applied by $T_k$ to a base object $b$ along with its response $r$ (we call it a RMW event and write
$(b, \langle g, h \rangle, r, k)$).

A configuration (of a TM implementation) specifies the value of each base object and the state 
of each process. The initial configuration is the configuration in which all base objects have their 
initial values and all processes are in their initial states. 

An execution fragment is a (finite or infinite) sequence of events. An execution of a TM
implementation $M$ is an execution fragment where, starting from the initial configuration, each
event is issued according to $M$ and each response of a RMW event $(b, \langle g, h \rangle, r, k)$
matches the state of $b$ resulting from all preceding events. An execution $E \cdot E'$ denotes the
concatenation of $E$ and execution fragment $E'$, and we say that $E'$ is an extension of $E$ or
$E'$ extends $E$.

Let $E$ be an execution fragment. For every transaction (resp., process) identifier $k$, $E|k$
denotes the subsequence of $E$ restricted to events of transaction $T_k$ (resp., process $p_k$). If
$E|k$ is non-empty, we say that $T_k$ (resp., $p_k$) participates in $E$, else we say $E$ is
$T_k$-free (resp., $p_k$-free). Two executions $E$ and $E'$ are indistinguishable to a set
$\mathcal{T}$ of transactions, if for each transaction $T_k \in \mathcal{T}$, $E|k = E'|k$. A TM
history is the subsequence of an execution consisting of the invocation and response events of
t-operations. Two histories $H$ and $H'$ are equivalent if $\mathit{txns}(H) = \mathit{txns}(H')$
and for every transaction $T_k \in \mathit{txns}(H)$, $H|k = H'|k$.

The read set (resp., the write set) of a transaction $T_k$ in an execution $E$, denoted
$\mathit{Rset}(T_k)$ (resp., $\mathit{Wset}(T_k)$), is the set of t-objects that $T_k$ reads (resp.,
writes to) in $E$. More specifically, if $E$ contains an invocation of $\mathit{read}_k(X)$ (resp.,
$\mathit{write}_k(X, v)$), we say that $X \in \mathit{Rset}(T_k)$ (resp., $\mathit{Wset}(T_k)$). The
data set of $T_k$ is $\mathit{Dset}(T_k) = \mathit{Rset}(T_k) \cup \mathit{Wset}(T_k)$. A transaction
is called read-only if $\mathit{Wset}(T_k) = \emptyset$; write-only if $\mathit{Rset}(T_k) = \emptyset$
and updating if $\mathit{Wset}(T_k) \neq \emptyset$. Note that we consider the conventional dynamic
TM programming model: the data set of a transaction is not known a priori (i.e., at the start of the
transaction) and it is identifiable only by the set of t-objects on which the
transaction has invoked a read or write in the given execution.
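
The following minimal sketch shows how read sets, write sets and data sets can be derived from the invocations that appear in an execution, in line with the dynamic model just described. The tuple encoding of invocations and the helper names (`rset`, `wset`, `dset`, `labels`) are assumptions made for this example.

```python
from typing import List, Set, Tuple

# Hypothetical encoding: an invocation is (transaction id, "read" | "write", t-object).
Invocation = Tuple[int, str, str]


def rset(history: List[Invocation], k: int) -> Set[str]:
    # t-objects on which T_k invoked a read
    return {x for (t, op, x) in history if t == k and op == "read"}


def wset(history: List[Invocation], k: int) -> Set[str]:
    # t-objects on which T_k invoked a write
    return {x for (t, op, x) in history if t == k and op == "write"}


def dset(history: List[Invocation], k: int) -> Set[str]:
    return rset(history, k) | wset(history, k)


def labels(history: List[Invocation], k: int) -> Set[str]:
    # The three notions are not mutually exclusive: a write-only
    # transaction with a non-empty write set is also updating.
    out = set()
    if not wset(history, k):
        out.add("read-only")
    if not rset(history, k):
        out.add("write-only")
    if wset(history, k):
        out.add("updating")
    return out
```

For instance, a transaction that reads $X$ and writes $Y$ has data set $\{X, Y\}$ and is updating, while one that only reads $Y$ is read-only.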

Transaction orders. Let $\mathit{txns}(E)$ denote the set of transactions that participate in $E$. An
execution $E$ is sequential if every invocation of a t-operation is either the last event in the history
$H$ exported by $E$ or is immediately followed by a matching response. We assume that executions
are well-formed, i.e., for all $T_k$, $E|k$ begins with the invocation of a t-operation, is sequential
and has no events after $A_k$ or $C_k$. A transaction $T_k \in \mathit{txns}(E)$ is complete in $E$ if
$E|k$ ends with a response event. The execution $E$ is complete if all transactions in
$\mathit{txns}(E)$ are complete in $E$. A transaction $T_k \in \mathit{txns}(E)$ is t-complete if
$E|k$ ends with $A_k$ or $C_k$; otherwise, $T_k$ is t-incomplete. $T_k$ is committed (resp., aborted)
in $E$ if the last event of $T_k$ is $C_k$ (resp., $A_k$). The execution $E$ is
t-complete if all transactions in $\mathit{txns}(E)$ are t-complete.

For transactions $\{T_k, T_m\} \subseteq \mathit{txns}(E)$, we say that $T_k$ precedes $T_m$ in the
real-time order of $E$, denoted $T_k \prec_E^{RT} T_m$, if $T_k$ is t-complete in $E$ and the last
event of $T_k$ precedes the first event of $T_m$ in $E$. If neither $T_k \prec_E^{RT} T_m$ nor
$T_m \prec_E^{RT} T_k$, then $T_k$ and $T_m$ are concurrent in $E$. An execution $E$ is t-sequential
if there are no concurrent transactions in $E$. We say that $\mathit{read}_k(X)$ is legal in a
t-sequential execution $E$ if it returns the latest written value of $X$ in $E$, and $E$ is legal if
every $\mathit{read}_k(X)$ in $E$ that does not return $A_k$ is legal in $E$.
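
The real-time order and the derived notions of concurrency and t-sequentiality can be sketched as follows. The event encoding (a pair of transaction identifier and label, with labels "C" and "A" standing for the responses $C_k$ and $A_k$) and the function names are assumptions made for this example.

```python
from typing import List, Tuple

# Hypothetical encoding: an event is (transaction id, label); the labels
# "C" and "A" stand for the commit and abort responses C_k and A_k.
Event = Tuple[int, str]


def t_complete(E: List[Event], k: int) -> bool:
    labels = [label for (t, label) in E if t == k]
    return bool(labels) and labels[-1] in ("C", "A")


def precedes(E: List[Event], k: int, m: int) -> bool:
    """T_k precedes T_m in the real-time order of E."""
    idx_k = [i for i, (t, _) in enumerate(E) if t == k]
    idx_m = [i for i, (t, _) in enumerate(E) if t == m]
    if not idx_k or not idx_m:
        return False
    # T_k is t-complete and its last event comes before the first event of T_m.
    return t_complete(E, k) and idx_m[0] > idx_k[-1]


def concurrent(E: List[Event], k: int, m: int) -> bool:
    return not precedes(E, k, m) and not precedes(E, m, k)


def t_sequential(E: List[Event]) -> bool:
    ids = sorted({t for (t, _) in E})
    return not any(concurrent(E, a, b) for a in ids for b in ids if b > a)
```

In an execution where $T_1$ commits before $T_2$ starts and the events of $T_2$ and $T_3$ interleave, `precedes` holds for the pair $(T_1, T_2)$ and `concurrent` holds for $(T_2, T_3)$.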

Contention. We say that a configuration C after an execution E is quiescent (resp., t-quiescent)
if every transaction $T_k \in \mathit{txns}(E)$ is complete (resp., t-complete) in C. If a transaction T is
incomplete in an execution E, it has exactly one enabled event, which is the next event the 
transaction will perform according to the TM implementation. Events e and e' of an execution E 
contend on a base object b if they are both events on b in E and at least one of them is nontrivial 
(the event is trivial (resp., nontrivial) if it is the application of a trivial (resp., nontrivial) primitive). 

We say that T is poised to apply an event e after E if e is the next enabled event for T in 
E. We say that transactions T and T' concurrently contend on b in E if they are poised to apply 
contending events on b after E. 

We say that an execution fragment $E$ is step contention-free for t-operation $op_k$ if the events
of $E|op_k$ are contiguous in $E$. We say that an execution fragment $E$ is step contention-free for
$T_k$ if the events of $E|k$ are contiguous in $E$. We say that $E$ is step contention-free if $E$ is
step contention-free for all transactions that participate in $E$.
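
A minimal sketch of the contiguity test behind step contention-freedom follows. Events are encoded as pairs of a transaction identifier and an arbitrary label; this encoding and the function names are assumptions of the example, not part of the model.

```python
from typing import List, Tuple

Event = Tuple[int, str]  # hypothetical encoding: (transaction id, label)


def step_contention_free_for(E: List[Event], k: int) -> bool:
    # The events of E|k are contiguous iff the span between the first and
    # the last event of T_k contains no event of another transaction.
    idx = [i for i, (t, _) in enumerate(E) if t == k]
    return not idx or idx[-1] - idx[0] + 1 == len(idx)


def step_contention_free(E: List[Event]) -> bool:
    return all(step_contention_free_for(E, k) for k in {t for (t, _) in E})
```

For example, if a step of $T_2$ falls between two steps of $T_1$, the fragment is step contention-free for $T_2$ but not for $T_1$.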

3 TM classes 

In this section, we define the properties of TM implementations considered in this paper.

TM-correctness. Informally, a t-sequential history $S$ is legal if every t-read of a t-object returns
the latest written value of this t-object in $S$. A history $H$ is opaque if there exists a legal
t-sequential history $S$ equivalent to $H$ such that $S$ respects the real-time order of transactions
in $H$ [17]. A weaker condition called strict serializability ensures opacity only with respect to
committed transactions. Precise definitions can be found in Appendix A.

TM-liveness. We say that a TM implementation $M$ provides obstruction-free (OF) TM-liveness
if for every finite execution $E$ of $M$, and every transaction $T_k$ that applies the invocation of a
t-operation $op_k$ immediately after $E$, the finite step contention-free extension for $op_k$ contains a
matching response. A TM implementation $M$ provides wait-free TM-liveness if in every execution
of $M$, every t-operation returns a matching response in a finite number of its steps.

TM-progress. Progress for TMs specifies the conditions under which a transaction is allowed
to abort. We say that a TM implementation $M$ provides obstruction-free (OF) TM-progress if
for every execution $E$ of $M$, if any transaction $T_k \in \mathit{txns}(E)$ returns $A_k$ in $E$,
then $E$ is not step contention-free for $T_k$.

We say that transactions $T_i$, $T_j$ conflict in an execution $E$ on a t-object $X$ if $T_i$ and $T_j$
are concurrent in $E$ and $X \in \mathit{Dset}(T_i) \cap \mathit{Dset}(T_j)$, and
$X \in \mathit{Wset}(T_i) \cup \mathit{Wset}(T_j)$. A TM implementation $M$ provides progressive
TM-progress (or progressiveness) if for every execution $E$ of $M$ and every transaction
$T_i \in \mathit{txns}(E)$ that returns $A_i$ in $E$, there exists a prefix $E'$ of $E$ and a transaction
$T_k \in \mathit{txns}(E')$ such that $T_k$ and $T_i$ conflict in $E$.
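
The conflict relation can be sketched as a set computation over read and write sets. The sketch below assumes the caller already knows which transactions are concurrent with the aborted one, and it ignores the quantification over prefixes $E'$; the dictionary encoding and the names `conflict_objects` and `abort_is_progressive` are hypothetical.

```python
from typing import Dict, Iterable, Set


def conflict_objects(rsets: Dict[int, Set[str]], wsets: Dict[int, Set[str]],
                     i: int, j: int) -> Set[str]:
    """t-objects on which T_i and T_j conflict, assuming they are concurrent."""
    dset_i = rsets[i] | wsets[i]
    dset_j = rsets[j] | wsets[j]
    # X must be in both data sets and in at least one of the write sets.
    return (dset_i & dset_j) & (wsets[i] | wsets[j])


def abort_is_progressive(aborted: int, concurrent_with: Iterable[int],
                         rsets: Dict[int, Set[str]],
                         wsets: Dict[int, Set[str]]) -> bool:
    # Simplification: an abort is justified if some concurrent transaction
    # conflicts with the aborted one on at least one t-object.
    return any(conflict_objects(rsets, wsets, aborted, k) for k in concurrent_with)
```

Two concurrent readers of $X$ do not conflict, so a progressive TM may not abort one because of the other; a concurrent writer of $X$ does justify an abort.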

Read invisibility. Informally, the invisible reads assumption prevents TM implementations 
from applying nontrivial primitives during t-read operations and from announcing read sets of 
transactions during tryCommit. 

We say that a TM implementation M uses invisible reads if for every execution E of M, 

• for every read-only transaction $T_k \in \mathit{txns}(E)$, no event of $E|k$ is nontrivial in $E$,

• for every updating transaction $T_k \in \mathit{txns}(E)$ with $\mathit{Rset}(T_k) \neq \emptyset$,
there exists an execution $E'$ of $M$ such that

– $\mathit{Rset}(T_k) = \emptyset$ in $E'$,

– $\mathit{txns}(E) = \mathit{txns}(E')$ and $\forall T_m \in \mathit{txns}(E) \setminus \{T_k\}$: $E|m = E'|m$,

– for any two step contention-free transactions $T_i, T_j \in \mathit{txns}(E)$, if the last event of $T_i$
precedes the first event of $T_j$ in $E$, then the last event of $T_i$ precedes the first event of
$T_j$ in $E'$.

Most popular TM implementations like TL2 [7] and NOrec [6] satisfy this definition of invisible
reads.

Disjoint-access parallelism (DAP). A TM implementation $M$ is strictly disjoint-access parallel
(strict DAP) if, for all executions $E$ of $M$, and for all transactions $T_i$ and $T_j$ that participate in
$E$, $T_i$ and $T_j$ contend on a base object in $E$ only if
$\mathit{Dset}(T_i) \cap \mathit{Dset}(T_j) \neq \emptyset$ [17].

We now describe two relaxations of strict DAP. For the formal definitions, we introduce the 
notion of a conflict graph which captures the dependency relation among t-objects accessed by 
transactions. 

We denote by $\tau_E(T_i, T_j)$ the set of transactions ($T_i$ and $T_j$ included) that are concurrent
to at least one of $T_i$ and $T_j$ in an execution $E$.

Let $G(T_i, T_j, E)$ be an undirected graph whose vertex set is
$\bigcup_{T \in \tau_E(T_i, T_j)} \mathit{Dset}(T)$ and there is an
edge between t-objects $X$ and $Y$ iff there exists $T \in \tau_E(T_i, T_j)$ such that
$\{X, Y\} \subseteq \mathit{Dset}(T)$. We say that $T_i$ and $T_j$ are disjoint-access in $E$ if there is
no path between a t-object in $\mathit{Dset}(T_i)$ and a t-object in $\mathit{Dset}(T_j)$ in
$G(T_i, T_j, E)$. A TM implementation $M$ is weak disjoint-access parallel (weak DAP) if, for all
executions $E$ of $M$, transactions $T_i$ and $T_j$ concurrently contend on the same base object in
$E$ only if $T_i$ and $T_j$ are not disjoint-access in $E$ or there exists a t-object
$X \in \mathit{Dset}(T_i) \cap \mathit{Dset}(T_j)$ [5].

Let $G(T_i, T_j, E)$ be an undirected graph whose vertex set is
$\bigcup_{T \in \tau_E(T_i, T_j)} \mathit{Dset}(T)$ and there is
an edge between t-objects $X$ and $Y$ iff there exists $T \in \tau_E(T_i, T_j)$ such that
$\{X, Y\} \subseteq \mathit{Wset}(T)$. We say that $T_i$ and $T_j$ are read-write disjoint-access in $E$
if there is no path between a t-object in $\mathit{Dset}(T_i)$ and a t-object in $\mathit{Dset}(T_j)$ in
$G(T_i, T_j, E)$. A TM implementation $M$ is read-write disjoint-access parallel (RW DAP) if, for all
executions $E$ of $M$, transactions $T_i$ and $T_j$ contend on the same base object in $E$ only if
$T_i$ and $T_j$ are not read-write disjoint-access in $E$ or there
exists a t-object $X \in \mathit{Dset}(T_i) \cap \mathit{Dset}(T_j)$.
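
The two conflict graphs defined above differ only in which sets induce edges. The sketch below builds either graph from the read and write sets of the transactions in $\tau_E(T_i, T_j)$ and tests for a path by depth-first search. The dictionary encoding and the names `build_graph`, `connected` and `disjoint_access` are assumptions of this example; the caller is assumed to pass exactly the transactions in $\tau_E(T_i, T_j)$.

```python
from itertools import combinations
from typing import Dict, Set, Tuple

Txn = Tuple[Set[str], Set[str]]  # hypothetical encoding: (read set, write set)


def build_graph(tau: Dict[int, Txn], rw: bool) -> Dict[str, Set[str]]:
    """Adjacency map over the t-objects accessed by the transactions in tau.

    rw=False: edge {X, Y} if some T in tau has both in Dset(T) (weak DAP graph).
    rw=True:  edge {X, Y} if some T in tau has both in Wset(T) (RW DAP graph).
    """
    adj: Dict[str, Set[str]] = {x: set() for (r, w) in tau.values() for x in r | w}
    for r, w in tau.values():
        for x, y in combinations(sorted(w if rw else r | w), 2):
            adj[x].add(y)
            adj[y].add(x)
    return adj


def connected(adj: Dict[str, Set[str]], sources: Set[str], targets: Set[str]) -> bool:
    """True if some path (possibly of length zero) links a source to a target."""
    seen, stack = set(sources), list(sources)
    while stack:
        x = stack.pop()
        if x in targets:
            return True
        for y in adj[x] - seen:
            seen.add(y)
            stack.append(y)
    return False


def disjoint_access(tau: Dict[int, Txn], i: int, j: int, rw: bool = False) -> bool:
    dset_i = tau[i][0] | tau[i][1]
    dset_j = tau[j][0] | tau[j][1]
    return not connected(build_graph(tau, rw), dset_i, dset_j)
```

As a usage example, take a t-incomplete $T_0$ that reads $X$ and writes $Y$, concurrent with $T_1$ writing $X$ and $T_2$ reading $Y$: `disjoint_access(tau, 1, 2)` is false for the data-set graph, while `disjoint_access(tau, 1, 2, rw=True)` is true because no single write set contains both $X$ and $Y$.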

We make the following observations about the DAP definitions presented in this paper. 

• From the definitions, it is immediate that every RW DAP TM implementation satisfies weak
DAP. But the converse is not true. Consider the following execution $E$ of a weak DAP TM
implementation $M$ that begins with the t-incomplete execution of a transaction $T_0$ that reads
$X$ and writes to $Y$, followed by the step contention-free executions of two transactions $T_1$
and $T_2$ which write to $X$ and read $Y$ respectively. Transactions $T_1$ and $T_2$ may contend
on a base object since there is a path between $X$ and $Y$ in $G(T_1, T_2, E)$. However, a RW
DAP TM implementation would preclude transactions $T_1$ and $T_2$ from contending on the
same base object: there is no edge between t-objects $X$ and $Y$ in the corresponding conflict
graph $G(T_1, T_2, E)$ because $X$ and $Y$ are not both contained in the write set of $T_0$.
Algorithm 3 in Appendix B.2 describes a TM implementation that satisfies weak DAP, but not RW DAP.

• From the definitions, it is immediate that every strict DAP TM implementation satisfies RW
DAP. But the converse is not true. To understand why, consider the following execution $E$ of
a RW DAP TM implementation that begins with the t-incomplete execution of a transaction
$T_0$ that accesses t-objects $X$ and $Y$, followed by the step contention-free executions of two
transactions $T_1$ and $T_2$ which access $X$ and $Y$ respectively. Transactions $T_1$ and $T_2$ may
contend on a base object since there is a path between $X$ and $Y$ in $G(T_1, T_2, E)$. However,
a strict DAP TM implementation would preclude transactions $T_1$ and $T_2$ from contending
on the same base object since $\mathit{Dset}(T_1) \cap \mathit{Dset}(T_2) = \emptyset$ in $E$.
Algorithm 2 in Appendix B.1 describes a TM implementation that satisfies RW DAP, but not strict DAP.

Figure 2: Executions in the proof of Theorem 2; the execution in 2d is not strictly serializable.
(a) $T_2$ returns the new value of $X$ since $T_1$ is invisible.
(b) $T_2$ and $T_3$ do not contend on any base object.
(c) $T_3$ does not access the base object from the nontrivial event $e$.
(d) $T_3$ and $T_1$ do not contend on any base object.


4 Lower bounds for obstruction-free TMs 


Let $\mathcal{OF}$ denote the class of TMs that provide OF TM-progress and OF TM-liveness. In
Section 4.1, we show that no strictly serializable TM in $\mathcal{OF}$ can be weak DAP and have
invisible reads. In Section 4.2, we determine stall complexity bounds for strictly serializable TMs in
$\mathcal{OF}$, and in Section 4.3, we present a linear (in $n$) lower bound on RAW/AWARs for RW
DAP opaque TMs in $\mathcal{OF}$.


4.1 Impossibility of invisible reads 

In this section, we prove that it is impossible to derive TM implementations in $\mathcal{OF}$ that
combine weak DAP and invisible reads. The following lemma will be useful in proving our result.

Lemma 1 ([5]). Let $M$ be any weak DAP TM implementation. Let $\alpha \cdot \rho_1 \cdot \rho_2$ be
any execution of $M$ where $\rho_1$ (resp., $\rho_2$) is the step contention-free execution fragment of
transaction $T_1 \notin \mathit{txns}(\alpha)$ (resp., $T_2 \notin \mathit{txns}(\alpha)$) and transactions
$T_1$, $T_2$ are disjoint-access in $\alpha \cdot \rho_1 \cdot \rho_2$. Then, $T_1$ and $T_2$ do not
contend on any base object in $\alpha \cdot \rho_1 \cdot \rho_2$.

Theorem 2. There does not exist a weak DAP strictly serializable TM implementation in $\mathcal{OF}$
that uses invisible reads.


Proof. By contradiction, assume that such an implementation $M \in \mathcal{OF}$ exists. Let $v$ be
the initial value of t-objects $X$ and $Z$. Consider an execution $E$ of $M$ in which a transaction
$T_0$ performs $\mathit{read}_0(Z) \to v$ (returning $v$), writes $nv \neq v$ to $X$, and commits. Let
$E'$ denote the longest prefix of $E$ that cannot be extended with the t-complete step contention-free
execution of transaction $T_1$ that performs a t-read of $X$ and returns $nv$ nor with the t-complete
step contention-free execution of transaction $T_2$ that performs a t-read of $X$ and returns $nv$.

Let $e$ be the enabled event of transaction $T_0$ in the configuration after $E'$. Without loss of
generality, assume that $E' \cdot e$ can be extended with the t-complete step contention-free execution
of committed transaction $T_2$ that reads $X$ and returns $nv$. Let $E' \cdot e \cdot E_2$ be such an
execution, where $E_2$ is the t-complete step contention-free execution fragment of transaction $T_2$
that performs $\mathit{read}_2(X) \to nv$ and commits.

We now prove that $M$ has an execution of the form $E' \cdot E_1 \cdot e \cdot E_2$, where $E_1$ is the
t-complete step contention-free execution fragment of transaction $T_1$ that performs
$\mathit{read}_1(X) \to v$ and commits.

We observe that $E' \cdot E_1$ is an execution of $M$. Indeed, by OF TM-progress and OF TM-liveness,
$T_1$ must return a matching response that is not $A_1$ in $E' \cdot E_1$, and by the definition of
$E'$, this response must be the initial value $v$ of $X$.

By the assumption of invisible reads, $E_1$ does not contain any nontrivial events. Consequently,
$E' \cdot E_1 \cdot e \cdot E_2$ is indistinguishable to transaction $T_2$ from the execution
$E' \cdot e \cdot E_2$. Thus, $E' \cdot E_1 \cdot e \cdot E_2$ is also an execution of $M$ (Figure 2a).

Claim 3. $M$ has an execution of the form $E' \cdot E_1 \cdot E_3 \cdot e \cdot E_2$ where $E_3$ is the
t-complete step contention-free execution fragment of transaction $T_3$ that writes $nv \neq v$ to $Z$
and commits.


Proof. The proof is through a sequence of indistinguishability arguments to construct the
execution.

We first claim that $M$ has an execution of the form $E' \cdot E_1 \cdot e \cdot E_2 \cdot E_3$. Indeed,
by OF TM-progress and OF TM-liveness, $T_3$ must be committed in
$E' \cdot E_1 \cdot e \cdot E_2 \cdot E_3$.

Since $M$ uses invisible reads, the execution $E' \cdot E_1 \cdot e \cdot E_2 \cdot E_3$ is
indistinguishable to transactions $T_2$ and $T_3$ from the execution $\hat{E} \cdot E_2 \cdot E_3$,
where $\hat{E}$ is the t-incomplete step contention-free execution of transaction $T_0$ with
$\mathit{Wset}_{\hat{E}}(T_0) = \{X\}$; $\mathit{Rset}_{\hat{E}}(T_0) = \emptyset$ that writes $nv$ to $X$.

Observe that the execution $E' \cdot E_1 \cdot e \cdot E_2 \cdot E_3$ is indistinguishable to transactions
$T_2$ and $T_3$ from the execution $\hat{E} \cdot E_2 \cdot E_3$, in which transactions $T_3$ and $T_2$
are disjoint-access. Consequently, by Lemma 1, $T_2$ and $T_3$ do not contend on any base object in
$\hat{E} \cdot E_2 \cdot E_3$. Thus, $M$ has an execution of the form
$E' \cdot E_1 \cdot e \cdot E_3 \cdot E_2$ (Figure 2b).

By definition of $E'$, $T_0$ applies a nontrivial primitive to some base object, say $b$, in event $e$ that
$T_2$ must access in $E_2$. Thus, the execution fragment $E_3$ does not contain any nontrivial event on
$b$ in the execution $E' \cdot E_1 \cdot e \cdot E_2 \cdot E_3$. In fact, since $T_3$ is disjoint-access
with $T_0$ in the execution $\hat{E} \cdot E_3 \cdot E_2$, by Lemma 1, it cannot access the base object
$b$ to which $T_0$ applies a nontrivial primitive in the event $e$. Thus, transaction $T_3$ must perform
the same sequence of events $E_3$ immediately after $E'$, implying that $M$ has an execution of the
form $E' \cdot E_1 \cdot E_3 \cdot e \cdot E_2$ (Figure 2c). □


Finally, we observe that the execution $E' \cdot E_1 \cdot E_3 \cdot e \cdot E_2$ established in Claim 3
is indistinguishable to transactions $T_1$ and $T_3$ from an execution
$\hat{E} \cdot E_1 \cdot E_3 \cdot e \cdot E_2$, where $\mathit{Wset}(T_0) = \{X\}$ and
$\mathit{Rset}(T_0) = \emptyset$ in $\hat{E}$. But transactions $T_3$ and $T_1$ are disjoint-access in
$\hat{E} \cdot E_1 \cdot E_3 \cdot e \cdot E_2$ and by Lemma 1, $T_1$ and $T_3$ do not contend on any
base object in this execution. Thus, $M$ has an execution of the form
$E' \cdot E_3 \cdot E_1 \cdot e \cdot E_2$ (Figure 2d) in which $T_3$ precedes $T_1$ in real-time order.

However, the execution $E' \cdot E_3 \cdot E_1 \cdot e \cdot E_2$ is not strictly serializable: $T_0$ must
be committed in any serialization and transaction $T_1$ must precede $T_0$ since $\mathit{read}_1(X)$
returns the initial value of $X$. To respect real-time order, $T_3$ must precede $T_1$, while $T_0$ must
precede $T_2$ since $\mathit{read}_2(X)$ returns $nv$, the value of $X$ updated by $T_0$. Finally, $T_0$
must precede $T_3$ since $\mathit{read}_0(Z)$ returns the
initial value of $Z$. But there exists no such serialization: a contradiction. □
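
The final step of the proof can be checked mechanically on this small instance. The sketch below enumerates all orderings of $T_0, \ldots, T_3$, keeps those that respect the real-time constraint used in the proof ($T_3$ before $T_1$) and replays each candidate to test whether every read is legal. The encoding of transactions as lists of operations and the names `TXNS`, `legal` and `serializations` are assumptions of this example; only the constraint stated in the proof is encoded, and adding further real-time constraints could only remove candidates.

```python
from itertools import permutations

V, NV = "v", "nv"  # placeholders for the initial value and the new value

# Each transaction is a list of t-operations with the responses observed
# in the final execution of the proof (hypothetical encoding).
TXNS = {
    0: [("read", "Z", V), ("write", "X", NV)],
    1: [("read", "X", V)],
    2: [("read", "X", NV)],
    3: [("write", "Z", NV)],
}
REAL_TIME = [(3, 1)]  # T3 precedes T1 in real-time order


def legal(order):
    # Replay the transactions sequentially and check every read response.
    state = {"X": V, "Z": V}
    for k in order:
        for op, obj, val in TXNS[k]:
            if op == "read" and state[obj] != val:
                return False
            if op == "write":
                state[obj] = val
    return True


def serializations(real_time):
    result = []
    for order in permutations(TXNS):
        pos = {k: i for i, k in enumerate(order)}
        if all(pos[b] > pos[a] for a, b in real_time) and legal(order):
            result.append(order)
    return result
```

With the real-time constraint, `serializations(REAL_TIME)` is empty, matching the cycle $T_0, T_3, T_1, T_0$ derived in the proof; without it, orders such as $T_1, T_0, T_2, T_3$ are legal.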

4.2 Stall complexity

Let $M$ be any TM implementation. Let $e$ be an event applied by process $p$ to a base object $b$ as it
performs a transaction $T$ during an execution $E$ of $M$. Let
$E = \alpha \cdot e_1 \cdots e_m \cdot e \cdot \beta$ be an execution of $M$, where $\alpha$ and $\beta$
are execution fragments and $e_1 \cdots e_m$ is a maximal sequence of $m \geq 1$ consecutive nontrivial
events by distinct processes other than $p$ that access $b$. Then, we say that $T$ incurs $m$ memory
stalls in $E$ on account of $e$. The number of memory stalls incurred
by $T$ in $E$ is the sum of memory stalls incurred by all events of $T$ in $E$ [2].
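
The stall count defined above can be sketched as a backward scan from each event of the process under consideration. The step encoding (process identifier, base object, nontrivial flag) and the function names are assumptions of this example, and stalls are attributed to a process rather than to a transaction for brevity.

```python
from typing import List, Tuple

# Hypothetical encoding of a step: (process id, base object, is_nontrivial).
Step = Tuple[int, str, bool]


def stalls_on_account_of(E: List[Step], idx: int) -> int:
    """Memory stalls incurred on account of the event E[idx]."""
    p, b, _ = E[idx]
    seen = set()
    i = idx - 1
    # Walk back over the maximal run of consecutive nontrivial events on b
    # applied by distinct processes other than p.
    while i >= 0:
        q, obj, nontrivial = E[i]
        if obj != b or not nontrivial or q == p or q in seen:
            break
        seen.add(q)
        i -= 1
    return len(seen)


def memory_stalls(E: List[Step], p: int) -> int:
    # Sum of the stalls incurred by all events of process p in E.
    return sum(stalls_on_account_of(E, i) for i, (q, _, _) in enumerate(E) if q == p)
```

For instance, if three distinct processes apply nontrivial primitives to $b$ immediately before $p$ accesses $b$, that access accounts for three stalls.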

In this section, we prove a lower bound of $n - 1$ on the worst-case number of stalls incurred
by a transaction as it performs a single t-read operation. We adopt the following definition of a
$k$-stall execution from [2].

Definition 1. An execution $\alpha \cdot \sigma_1 \cdots \sigma_i$ is a $k$-stall execution for
t-operation $op$ executed by process $p$ if

• $\alpha$ is $p$-free,

• there are distinct base objects $b_1, \ldots, b_i$ and disjoint sets of processes $S_1, \ldots, S_i$
whose union does not include $p$ and has cardinality $k$ such that, for $j = 1, \ldots, i$,

– each process in $S_j$ has an enabled nontrivial event about to access base object $b_j$ after
$\alpha$, and

– in $\sigma_j$, $p$ applies events by itself until it is the first about to apply an event to $b_j$, then
each of the processes in $S_j$ applies an event that accesses $b_j$, and finally, $p$ applies an
event that accesses $b_j$,

• $p$ invokes exactly one t-operation $op$ in the execution fragment $\sigma_1 \cdots \sigma_i$,

• $\sigma_1 \cdots \sigma_i$ contains no events of processes not in
$(\{p\} \cup S_1 \cup \cdots \cup S_i)$,

• in every $(\{p\} \cup S_1 \cup \cdots \cup S_i)$-free execution fragment that extends $\alpha$, no
process applies a nontrivial event to any base object accessed in $\sigma_1 \cdots \sigma_i$.

Observe that in a $k$-stall execution $E$ for t-operation $op$, the number of memory stalls incurred
by $op$ in $E$ is $k$.

Lemma 4. Let $\alpha \cdot \sigma_1 \cdots \sigma_i$ be a $k$-stall execution for t-operation $op$
executed by process $p$. Then, $\alpha \cdot \sigma_1 \cdots \sigma_i$ is indistinguishable to $p$ from a
step contention-free execution.

Theorem 5. Every strictly serializable TM implementation $M \in \mathcal{OF}$ has an $(n-1)$-stall
execution $E$ for a t-read operation performed in $E$.

Proof. We proceed by induction. Observe that the empty execution is a 0-stall execution since it
vacuously satisfies the invariants of Definition 1.

Let $v$ be the initial value of t-objects $X$ and $Z$. Let $\alpha = \alpha_1 \cdots \alpha_{n-2}$ be a
step contention-free execution of a strictly serializable TM implementation $M \in \mathcal{OF}$, where
for all $j \in \{1, \ldots, n-2\}$, $\alpha_j$ is the longest prefix of the execution fragment
$\bar{\alpha}_j$ that denotes the t-complete step contention-free execution of committed transaction
$T_j$ (invoked by process $p_j$) that performs $\mathit{read}_j(Z) \to v$, writes value $nv \neq v$ to
$X$ in the execution $\alpha_1 \cdots \alpha_{j-1} \cdot \bar{\alpha}_j$ such that

• $\mathit{tryC}_j()$ is incomplete in $\alpha_j$,

• $\alpha_1 \cdots \alpha_j$ cannot be extended with the t-complete step contention-free execution
fragment of any transaction $T_{n-1}$ or $T_n$ that performs exactly one t-read of $X$ that returns
$nv$ and commits.

Assume, inductively, that $\alpha \cdot \sigma_1 \cdots \sigma_i$ is a $k$-stall execution for
$\mathit{read}_n(X)$ executed by process $p_n$, where $0 \leq k \leq n - 2$. By Definition 1, there are
distinct base objects $b_1, \ldots, b_i$ accessed by disjoint sets of processes $S_1, \ldots, S_i$ in the
execution fragment $\sigma_1 \cdots \sigma_i$, where $|S_1 \cup \ldots \cup S_i| = k$ and
$\sigma_1 \cdots \sigma_i$ contains no events of processes not in
$S_1 \cup \ldots \cup S_i \cup \{p_n\}$. We will prove that there exists a $(k + k')$-stall execution for
$\mathit{read}_n(X)$, for some $k' \geq 1$.

By Lemma 4, $\alpha \cdot \sigma_1 \cdots \sigma_i$ is indistinguishable to $T_n$ from a step
contention-free execution. Let $\sigma$ be the finite step contention-free execution fragment that
extends $\alpha \cdot \sigma_1 \cdots \sigma_i$ in which $T_n$ performs events by itself: completes
$\mathit{read}_n(X)$ and returns a response. By OF TM-progress and OF TM-liveness,
$\mathit{read}_n(X)$ and the subsequent $\mathit{tryC}_n$ must each return non-$A_n$ responses in
$\alpha \cdot \sigma_1 \cdots \sigma_i \cdot \sigma$. By construction of $\alpha$ and strict serializability
of $M$, $\mathit{read}_n(X)$ must return the response $v$ or $nv$ in this execution. We prove that there
exists an execution fragment $\gamma$ performed by some process
$p_{n-1} \notin (\{p_n\} \cup S_1 \cup \cdots \cup S_i)$ extending $\alpha$ that contains a nontrivial
event on some base object that must be accessed by $\mathit{read}_n(X)$ in
$\sigma_1 \cdots \sigma_i \cdot \sigma$.

Consider the case that $\mathit{read}_n(X)$ returns the response $nv$ in $\alpha \cdot \sigma_1 \cdots \sigma_i \cdot \sigma$. We define a step
contention-free fragment $\gamma$ extending $\alpha$ that is the t-complete step contention-free execution of
transaction $T_{n-1}$ executed by some process $p_{n-1} \notin (\{p_n\} \cup S_1 \cup \cdots \cup S_i)$ that performs $\mathit{read}_{n-1}(X) \to v$,
writes $nv \neq v$ to $Z$ and commits. By definition of $\alpha$, OF TM-progress and OF TM-liveness, $M$
has an execution of the form $\alpha \cdot \gamma$. We claim that the execution fragment $\gamma$ must contain a nontrivial
event on some base object that must be accessed by $\mathit{read}_n(X)$ in $\sigma_1 \cdots \sigma_i \cdot \sigma$. Suppose otherwise.
Then, $\mathit{read}_n(X)$ must return the response $nv$ in $\sigma_1 \cdots \sigma_i \cdot \sigma$. But the execution $\alpha \cdot \sigma_1 \cdots \sigma_i \cdot \sigma$ is
not strictly serializable. Since $\mathit{read}_n(X) \to nv$, there exists a transaction $T_q \in txns(\alpha)$ that must
be committed and must precede $T_n$ in any serialization. Transaction $T_{n-1}$ must precede $T_n$ in any
serialization to respect the real-time order and $T_{n-1}$ must precede $T_q$ in any serialization. Also,
$T_q$ must precede $T_{n-1}$ in any serialization. But there exists no such serialization.

Consider the case that $\mathit{read}_n(X)$ returns the response $v$ in $\alpha \cdot \sigma_1 \cdots \sigma_i \cdot \sigma$. In this case, we define
the step contention-free fragment $\gamma$ extending $\alpha$ as the t-complete step contention-free execution
of transaction $T_{n-1}$ executed by some process $p_{n-1} \notin (\{p_n\} \cup S_1 \cup \cdots \cup S_i)$ that writes $nv \neq v$ to
$X$ and commits. By definition of $\alpha$, OF TM-progress and OF TM-liveness, $M$ has an execution of
the form $\alpha \cdot \gamma$. By strict serializability of $M$, the execution fragment $\gamma$ must contain a nontrivial
event on some base object that must be accessed by $\mathit{read}_n(X)$ in $\sigma_1 \cdots \sigma_i \cdot \sigma$. Suppose otherwise.
Then, $\sigma_1 \cdots \sigma_i \cdot \gamma \cdot \sigma$ is an execution of $M$ in which $\mathit{read}_n(X) \to v$. But this execution is not strictly
serializable: every transaction $T_q \in txns(\alpha)$ must be aborted or must be preceded by $T_n$ in any
serialization, but committed transaction $T_{n-1}$ must precede $T_n$ in any serialization to respect the
real-time ordering of transactions. But then $\mathit{read}_n(X)$ must return the new value $nv$ of $X$ that is
updated by $T_{n-1}$, a contradiction.

Since, by Definition 1, the execution fragment $\gamma$ executed by some process $p_{n-1} \notin (\{p_n\} \cup S_1 \cup
\cdots \cup S_i)$ contains no nontrivial events to any base object accessed in $\sigma_1 \cdots \sigma_i$, it must contain a
nontrivial event to some base object $b_{i+1} \notin \{b_1, \ldots, b_i\}$ that is accessed by $T_n$ in the execution
fragment $\sigma$.

Let $A$ denote the set of all finite $(\{p_n\} \cup S_1 \cup \cdots \cup S_i)$-free execution fragments that extend $\alpha$.
Let $b_{i+1} \notin \{b_1, \ldots, b_i\}$ be the first base object accessed by $T_n$ in the execution fragment $\sigma$ to which
some transaction applies a nontrivial event in the execution fragment $\alpha' \in A$. Clearly, some such
execution $\alpha \cdot \alpha'$ exists that contains a nontrivial event in $\alpha'$ to some distinct base object $b_{i+1}$ not
accessed in the execution fragment $\sigma_1 \cdots \sigma_i$. We choose the execution $\alpha \cdot \alpha'$, with $\alpha' \in A$, that maximizes
the number of transactions that are poised to apply nontrivial events on $b_{i+1}$ in the configuration
after $\alpha \cdot \alpha'$. Let $S_{i+1}$ denote the set of processes executing these transactions and $k' = |S_{i+1}|$
($k' > 0$ as already proved).

We now construct a $(k + k')$-stall execution $\alpha \cdot \alpha' \cdot \sigma_1 \cdots \sigma_i \cdot \sigma_{i+1}$ for $\mathit{read}_n(X)$, where in $\sigma_{i+1}$,
$p_n$ applies events by itself, then each of the processes in $S_{i+1}$ applies a nontrivial event on $b_{i+1}$,
and finally, $p_n$ accesses $b_{i+1}$.

By construction, $\alpha \cdot \alpha'$ is $p_n$-free. Let $\sigma_{i+1}$ be the prefix of $\sigma$ not including $T_n$'s first access
to $b_{i+1}$, concatenated with the nontrivial events on $b_{i+1}$ by each of the $k'$ transactions executed
by processes in $S_{i+1}$ followed by the access of $b_{i+1}$ by $T_n$. Observe that $T_n$ performs exactly one
t-operation $\mathit{read}_n(X)$ in the execution fragment $\sigma_1 \cdots \sigma_{i+1}$ and $\sigma_1 \cdots \sigma_{i+1}$ contains no events of
processes not in $(\{p_n\} \cup S_1 \cup \cdots \cup S_i \cup S_{i+1})$.

To complete the induction, we need to show that in every $(\{p_n\} \cup S_1 \cup \cdots \cup S_i \cup S_{i+1})$-free
extension of $\alpha \cdot \alpha'$, no transaction applies a nontrivial event to any base object accessed in the
execution fragment $\sigma_1 \cdots \sigma_i \cdot \sigma_{i+1}$. Let $\beta$ be any such execution fragment that extends $\alpha \cdot \alpha'$. By our
construction, $\sigma_{i+1}$ is the execution fragment that consists of events by $p_n$ on base objects accessed
in $\sigma_1 \cdots \sigma_i$, nontrivial events on $b_{i+1}$ by transactions in $S_{i+1}$ and finally, an access to $b_{i+1}$ by $p_n$.
Since $\alpha \cdot \sigma_1 \cdots \sigma_i$ is a $k$-stall execution by our induction hypothesis, $\alpha' \cdot \beta$ is $(\{p_n\} \cup S_1 \cup \cdots \cup S_i)$-free
and thus, $\alpha' \cdot \beta$ does not contain nontrivial events on any base object accessed in $\sigma_1 \cdots \sigma_i$. We
now claim that $\beta$ does not contain nontrivial events to $b_{i+1}$. Suppose otherwise. Thus, there
exists some transaction $T'$ that has an enabled nontrivial event to $b_{i+1}$ in the configuration after
$\alpha \cdot \alpha' \cdot \beta'$, where $\beta'$ is some prefix of $\beta$. But this contradicts the choice of $\alpha \cdot \alpha'$ as the extension
of $\alpha$ that maximizes $k'$.

Thus, $\alpha \cdot \alpha' \cdot \sigma_1 \cdots \sigma_i \cdot \sigma_{i+1}$ is indeed a $(k + k')$-stall execution for $T_n$ where $1 \leq k < (k + k') \leq
(n-1)$. □

Figure 3: Executions in the proof of Theorem 6; the execution in 3d is not opaque. (Timeline diagrams not reproduced.)
(a) Transactions in $\{T_1, \ldots, T_m\}$; $m = n-3$ are mutually read-write disjoint-access and concurrent; they are poised to apply a nontrivial primitive.
(b) $T_n$ performs $m$ reads; each $\mathit{read}_n(X_j)$ returns initial value $v$.
(c) $T_{n-2}$ commits; $T_n$ is read-write disjoint-access with $T_{n-2}$.
(d) Suppose $\mathit{read}_n(X_j)$ does not perform a RAW/AWAR, $T_n$ and $T_{n-1}$ are unaware of step contention and $T_n$ misses the event of $T_j$, but $\mathit{read}_{n-1}(X_j)$ returns the value of $X_j$ that is updated by $T_j$.

4.3 RAW/AWAR complexity 

Attiya et al. [3] identified two common expensive synchronization patterns that frequently arise in
the design of concurrent algorithms: read-after-write (RAW) and atomic write-after-read (AWAR).
In this section, we prove that opaque, RW DAP TM implementations in $\mathcal{OF}$ have executions in
which some read-only transaction performs a linear (in $n$) number of RAWs or AWARs.

We recall the formal definitions of RAW and AWAR from [3]. Let $\pi^i$ denote the $i$-th event in
an execution $\pi$ ($i = 0, \ldots, |\pi| - 1$).

We say that a transaction $T$ performs a RAW (read-after-write) in $\pi$ if $\exists i, j$; $0 \leq i < j < |\pi|$
such that (1) $\pi^i$ is a write to a base object $b$ by $T$, (2) $\pi^j$ is a read of a base object $b' \neq b$ by $T$
and (3) there is no $\pi^k$ such that $i < k < j$ and $\pi^k$ is a write to $b'$ by $T$. In this paper, we are
concerned only with non-overlapping RAWs, i.e., the read performed by one precedes the write
performed by the other.

We say a transaction $T$ performs an AWAR (atomic-write-after-read) in $\pi$ if $\exists i$, $0 \leq i < |\pi|$
such that the event $\pi^i$ is the application of a nontrivial primitive that atomically reads a base
object $b$ and writes to $b$.
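The two definitions above can be made concrete with a small trace scanner. The following Python sketch is illustrative only and not part of the paper: the `Event` encoding, the kind labels `"read"`, `"write"`, `"rmw"` and the greedy counting strategy are assumptions. It counts non-overlapping RAWs by resetting its bookkeeping after each counted read, and it counts every read-modify-write event as one AWAR (an `"rmw"` event is deliberately not treated as a write for RAW purposes).

```python
from typing import NamedTuple

class Event(NamedTuple):
    txn: str    # transaction applying the event
    kind: str   # "read", "write", or "rmw" (hypothetical labels)
    obj: str    # base object accessed

def count_raw_awar(trace, txn):
    """Greedily count non-overlapping RAWs and AWARs of `txn` in `trace`."""
    raws = awars = 0
    last_write = {}  # base object -> position of txn's latest write since the last counted RAW
    for pos, ev in enumerate(trace):
        if ev.txn != txn:
            continue
        if ev.kind == "rmw":
            awars += 1
        elif ev.kind == "write":
            last_write[ev.obj] = pos
        elif ev.kind == "read":
            own = last_write.get(ev.obj, -1)
            # RAW: an earlier write to some b != b' with no later write to b' in between
            if any(o != ev.obj and p > own for o, p in last_write.items()):
                raws += 1
                last_write.clear()  # later RAWs must start with a write after this read
    return raws, awars
```

For instance, a transaction that writes its own announcement bit and then reads another process's bit, as in the lock acquisition of Section 5, is reported as performing one RAW.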

Theorem 6. Every RW DAP opaque TM implementation $M \in \mathcal{OF}$ has an execution $E$ in which
some read-only transaction $T \in txns(E)$ performs $\Omega(n)$ non-overlapping RAW/AWARs.

Proof. For all $j \in \{1, \ldots, m\}$; $m = n-3$, let $v$ be the initial value of t-objects $X_j$ and $Z_j$.

Throughout this proof, we assume that, for all $i \in \{1, \ldots, n\}$, transaction $T_i$ is invoked by process
$p_i$.

By OF TM-progress and OF TM-liveness, any opaque and RW DAP TM implementation
$M \in \mathcal{OF}$ has an execution of the form $\bar{\rho}_1 \cdots \bar{\rho}_m$, where for all $j \in \{1, \ldots, m\}$, $\bar{\rho}_j$ denotes the
t-complete step contention-free execution of transaction $T_j$ that performs $\mathit{read}_j(Z_j) \to v$, writes
value $nv \neq v$ to $X_j$ and commits.

By construction, any two transactions that participate in $\bar{\rho}_1 \cdots \bar{\rho}_m$ are mutually read-write
disjoint-access and cannot contend on the same base object. It follows that for all $1 \leq j \leq m$, $\bar{\rho}_j$
is an execution of $M$.

For all $j \in \{1, \ldots, m\}$, we iteratively define an execution $\rho_j$ of $M$ as follows: it is the longest
prefix of $\bar{\rho}_j$ such that $\rho_1 \cdots \rho_j$ cannot be extended with the complete step contention-free
execution fragment of transaction $T_n$ that performs $j$ t-reads: $\mathit{read}_n(X_1) \cdots \mathit{read}_n(X_j)$ in which
$\mathit{read}_n(X_j) \to nv$ nor with the complete step contention-free execution fragment of transaction $T_{n-1}$
that performs $j$ t-reads: $\mathit{read}_{n-1}(X_1) \cdots \mathit{read}_{n-1}(X_j)$ in which $\mathit{read}_{n-1}(X_j) \to nv$ (Figure 3a).

For any $j \in \{1, \ldots, m\}$, let $e_j$ be the event transaction $T_j$ is poised to apply in the configuration
after $\rho_1 \cdots \rho_j$. Thus, the execution $\rho_1 \cdots \rho_j \cdot e_j$ can be extended with the complete step contention-
free executions of at least one of transaction $T_n$ or $T_{n-1}$ that performs $j$ t-reads of $X_1, \ldots, X_j$ in
which the t-read of $X_j$ returns the new value $nv$. Let $T_{n-1}$ be the transaction that must return
the new value for the maximum number of $X_j$'s when $\rho_1 \cdots \rho_j \cdot e_j$ is extended with the t-reads of
$X_1, \ldots, X_j$. We show that, in the worst-case, transaction $T_n$ must perform $\lceil m/2 \rceil$ non-overlapping
RAW/AWARs in the course of performing $m$ t-reads of $X_1, \ldots, X_m$ immediately after $\rho_1 \cdots \rho_m$.
Symmetric arguments apply for the case when $T_n$ must return the new value for the maximum
number of $X_j$'s when $\rho_1 \cdots \rho_j \cdot e_j$ is extended with the t-reads of $X_1, \ldots, X_j$.

Proving the RAW/AWAR lower bound. We prove that transaction $T_n$ must perform $\lceil m/2 \rceil$
non-overlapping RAWs or AWARs in the course of performing $m$ t-reads of $X_1, \ldots, X_m$ immediately
after the execution $\rho_1 \cdots \rho_m$. Specifically, we prove that $T_n$ must perform a RAW or an
AWAR during the execution of the t-read of each $X_j$ such that $\rho_1 \cdots \rho_j \cdot e_j$ can be extended
with the complete step contention-free execution of $T_{n-1}$ as it performs $j$ t-reads of $X_1, \ldots, X_j$ in
which the t-read of $X_j$ returns the new value $nv$. Let $\mathbb{J}$ denote the set of all $j \in \{1, \ldots, m\}$ such
that $\rho_1 \cdots \rho_j \cdot e_j$ extended with the complete step contention-free execution of $T_{n-1}$ performing $j$
t-reads of $X_1, \ldots, X_j$ must return the new value $nv$ during the t-read of $X_j$.

We first prove that, for all $j \in \mathbb{J}$, $M$ has an execution of the form $\rho_1 \cdots \rho_m \cdot \delta_j$ (Figures 3a
and 3b), where $\delta_j$ is the complete step contention-free execution fragment of $T_n$ that performs $j$
t-reads: $\mathit{read}_n(X_1) \cdots \mathit{read}_n(X_j)$, each of which returns the initial value $v$.

By definition of $\rho_j$, OF TM-progress and OF TM-liveness, $M$ has an execution of the form
$\rho_1 \cdots \rho_j \cdot \delta_j$. By construction, transaction $T_n$ is read-write disjoint-access with each transaction
$T \in \{T_{j+1}, \ldots, T_m\}$ in $\rho_1 \cdots \rho_j \cdots \rho_m \cdot \delta_j$. Thus, $T_n$ cannot contend with any of the transactions in
$\{T_{j+1}, \ldots, T_m\}$, implying that, for all $j \in \{1, \ldots, m\}$, $M$ has an execution of the form $\rho_1 \cdots \rho_m \cdot \delta_j$
(Figure 3b).

We claim that, for each $j \in \mathbb{J}$, the t-read of $X_j$ performed by $T_n$ must perform a RAW or an
AWAR in the course of performing $j$ t-reads of $X_1, \ldots, X_j$ immediately after $\rho_1 \cdots \rho_m$. Suppose
by contradiction that $\mathit{read}_n(X_j)$ does not perform a RAW or an AWAR in $\rho_1 \cdots \rho_m \cdot \delta_m$.

Claim 7. For all $j \in \mathbb{J}$, $M$ has an execution of the form $\rho_1 \cdots \rho_j \cdots \rho_m \cdot \delta_{j-1} \cdot e_j \cdot \beta$, where $\beta$ is
the complete step contention-free execution fragment of transaction $T_{n-1}$ that performs $j$ t-reads:
$\mathit{read}_{n-1}(X_1) \cdots \mathit{read}_{n-1}(X_{j-1}) \cdot \mathit{read}_{n-1}(X_j)$ in which $\mathit{read}_{n-1}(X_j)$ returns $nv$.


Proof. We observe that transaction $T_n$ is read-write disjoint-access with every transaction $T \in
\{T_j, T_{j+1}, \ldots, T_m\}$ in $\rho_1 \cdots \rho_j \cdots \rho_m \cdot \delta_{j-1}$. By RW DAP, it follows that $M$ has an execution of
the form $\rho_1 \cdots \rho_j \cdots \rho_m \cdot \delta_{j-1} \cdot e_j$ since $T_n$ cannot perform a nontrivial event on the base object
accessed by $T_j$ in the event $e_j$.

By the definition of $\rho_j$, transaction $T_{n-1}$ must access the base object to which $T_j$ applies a
nontrivial primitive in $e_j$ to return the value $nv$ of $X_j$ as it performs $j$ t-reads of $X_1, \ldots, X_j$
immediately after the execution $\rho_1 \cdots \rho_j \cdots \rho_m \cdot \delta_{j-1} \cdot e_j$. Thus, $M$ has an execution of the form
$\rho_1 \cdots \rho_j \cdot \delta_{j-1} \cdot e_j \cdot \beta$.

By construction, transaction $T_{n-1}$ is read-write disjoint-access with every transaction $T \in
\{T_{j+1}, \ldots, T_m\}$ in $\rho_1 \cdots \rho_j \cdots \rho_m \cdot \delta_{j-1} \cdot e_j \cdot \beta$. It follows that $M$ has an execution of the form
$\rho_1 \cdots \rho_j \cdots \rho_m \cdot \delta_{j-1} \cdot e_j \cdot \beta$. □

Claim 8. For all $j \in \{1, \ldots, m\}$, $M$ has an execution of the form $\rho_1 \cdots \rho_j \cdots \rho_m \cdot \gamma \cdot \delta_{j-1} \cdot e_j \cdot \beta$,
where $\gamma$ is the t-complete step contention-free execution fragment of transaction $T_{n-2}$ that writes
$nv \neq v$ to $Z_j$ and commits.


Proof. Observe that $T_{n-2}$ precedes transactions $T_n$ and $T_{n-1}$ in real-time order in the above
execution.

By OF TM-progress and OF TM-liveness, transaction $T_{n-2}$ must be committed in $\rho_1 \cdots \rho_j \cdots \rho_m \cdot \gamma$.

Since transaction $T_{n-1}$ is read-write disjoint-access with $T_{n-2}$ in $\rho_1 \cdots \rho_j \cdots \rho_m \cdot \gamma \cdot \delta_{j-1} \cdot e_j \cdot \beta$,
$T_{n-1}$ does not contend with $T_{n-2}$ on any base object (recall that we associate an edge with t-
objects in the conflict graph only if they are both contained in the write set of some transaction).
Since the execution fragment $\beta$ contains an access to the base object to which $T_j$ performs a
nontrivial primitive in the event $e_j$, $T_{n-2}$ cannot perform a nontrivial event on this base object
in $\gamma$. It follows that $M$ has an execution of the form $\rho_1 \cdots \rho_j \cdots \rho_m \cdot \gamma \cdot \delta_{j-1} \cdot e_j \cdot \beta$ since it is
indistinguishable to $T_{n-1}$ from the execution $\rho_1 \cdots \rho_j \cdots \rho_m \cdot \delta_{j-1} \cdot e_j \cdot \beta$ (the existence of which is
already established in Claim 7). □

Recall that transaction $T_n$ is read-write disjoint-access with $T_{n-2}$ in $\rho_1 \cdots \rho_j \cdots \rho_m \cdot \gamma \cdot \delta_j$. Thus,
$M$ has an execution of the form $\rho_1 \cdots \rho_j \cdots \rho_m \cdot \gamma \cdot \delta_j$ (Figure 3c).

Deriving a contradiction. For all $j \in \{1, \ldots, m\}$, we represent the execution fragment $\delta_j$ as
$\delta_{j-1} \cdot \pi^j$, where $\pi^j$ is the complete execution fragment of the $j$-th t-read $\mathit{read}_n(X_j) \to v$. By our
assumption, $\pi^j$ does not contain a RAW or an AWAR.

For succinctness, let $\alpha = \rho_1 \cdots \rho_m \cdot \gamma \cdot \delta_{j-1}$. We now prove that if $\pi^j$ does not contain a RAW
or an AWAR, we can define $\pi_1^j \cdot \pi_2^j = \pi^j$ to construct an execution of the form $\alpha \cdot \pi_1^j \cdot e_j \cdot \beta \cdot \pi_2^j$
(Figure 3d) such that

• no event in $\pi_1^j$ is the application of a nontrivial primitive,

• $\alpha \cdot \pi_1^j \cdot e_j \cdot \beta \cdot \pi_2^j$ is indistinguishable to $T_n$ from the step contention-free execution $\alpha \cdot \pi_1^j \cdot \pi_2^j$,

• $\alpha \cdot \pi_1^j \cdot e_j \cdot \beta \cdot \pi_2^j$ is indistinguishable to $T_{n-1}$ from the step contention-free execution $\alpha \cdot e_j \cdot \beta$.

The following claim defines $\pi_1^j$ and $\pi_2^j$ to construct this execution.

Claim 9. For all $j \in \{1, \ldots, m\}$, $M$ has an execution of the form $\alpha \cdot \pi_1^j \cdot e_j \cdot \beta \cdot \pi_2^j$.


Proof. Let $t$ be the first event containing a write to a base object in the execution fragment $\pi^j$.
We represent $\pi^j$ as the execution fragment $\pi_1^j \cdot t \cdot \pi_f^j$. Since $\pi_1^j$ does not contain nontrivial events
that write to a base object, $\alpha \cdot \pi_1^j \cdot e_j \cdot \beta$ is indistinguishable to transaction $T_{n-1}$ from the step
contention-free execution $\alpha \cdot e_j \cdot \beta$ (as already proven in Claim 8). Consequently, $\alpha \cdot \pi_1^j \cdot e_j \cdot \beta$ is
an execution of $M$.

Since $t$ is not an atomic-write-after-read, $M$ has an execution of the form $\alpha \cdot \pi_1^j \cdot e_j \cdot \beta \cdot t$.
Secondly, since $\pi^j$ does not contain a read-after-write, any read of a base object performed in $\pi_f^j$
may only be performed to base objects previously written in $t \cdot \pi_f^j$. Thus, $\alpha \cdot \pi_1^j \cdot e_j \cdot \beta \cdot t \cdot \pi_f^j$
is indistinguishable to $T_n$ from the step contention-free execution $\alpha \cdot \pi_1^j \cdot t \cdot \pi_f^j$. But, as already
proved, $\alpha \cdot \pi^j$ is an execution of $M$.

Choosing $\pi_2^j = t \cdot \pi_f^j$, it follows that $M$ has an execution of the form $\alpha \cdot \pi_1^j \cdot e_j \cdot \beta \cdot \pi_2^j$. □


We have now proved that, for all $j \in \{1, \ldots, m\}$, $M$ has an execution of the form $\rho_1 \cdots \rho_m \cdot \gamma \cdot
\delta_{j-1} \cdot \pi_1^j \cdot e_j \cdot \beta \cdot \pi_2^j$ (Figure 3d).

The execution in Figure 3d is not opaque. Indeed, in any serialization the following must hold.
Since $T_{n-1}$ reads the value written by $T_j$ in $X_j$, $T_j$ must be committed. Since $\mathit{read}_n(X_j)$ returns
the initial value $v$, $T_n$ must precede $T_j$. The committed transaction $T_{n-2}$, which writes a new
value to $Z_j$, must precede $T_n$ to respect the real-time order on transactions. However, $T_j$ must
precede $T_{n-2}$ since $\mathit{read}_j(Z_j)$ returns the initial value of $Z_j$. The cycle $T_j \to T_{n-2} \to T_n \to T_j$
implies that there exists no such serialization.
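The cycle argument can be checked mechanically by treating each "must precede" constraint as an edge and asking for a topological order. The Python sketch below is an illustration under assumptions: the transaction labels and the helper name `serialization_order` are hypothetical, and only the three ordering constraints named in the paragraph are encoded. It relies on the standard-library `graphlib.TopologicalSorter` (Python 3.9+), which raises `CycleError` when no order exists.

```python
from graphlib import CycleError, TopologicalSorter

def serialization_order(must_come_before):
    """Return one order consistent with the constraints, or None if they are cyclic.

    `must_come_before` maps each transaction to the set of transactions
    that have to precede it in any serialization.
    """
    try:
        return list(TopologicalSorter(must_come_before).static_order())
    except CycleError:
        return None

# Constraints from the execution of Figure 3d (labels are illustrative):
constraints_3d = {
    "T_n-2": {"T_j"},    # read_j(Z_j) returned the initial value of Z_j
    "T_n":   {"T_n-2"},  # real-time order
    "T_j":   {"T_n"},    # read_n(X_j) returned the initial value v
}

# Dropping the last constraint (T_n before T_j) removes the cycle.
constraints_without_last = {
    "T_n-2": {"T_j"},
    "T_n":   {"T_n-2"},
}
```

With all three constraints the helper returns `None`, matching the claim that no serialization exists; without the last one it returns the chain from `T_j` to `T_n`.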

Thus, for each $j \in \mathbb{J}$, transaction $T_n$ must perform a RAW or an AWAR during the t-read
of $X_j$ in the course of performing $m$ t-reads of $X_1, \ldots, X_m$ immediately after $\rho_1 \cdots \rho_m$. Since
$|\mathbb{J}| \geq \lceil \frac{n-3}{2} \rceil$, in the worst-case, $T_n$ must perform $\Omega(n)$ RAW/AWARs during the execution of $m$
t-reads immediately after $\rho_1 \cdots \rho_m$. □
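The counting step behind the bound can be spelled out as follows. This is a reading of the argument rather than text from the paper; the symbol $\mathbb{J}'$, denoting the symmetric index set for $T_n$, is introduced here only for illustration.

```latex
% For every j in {1,...,m}, at least one of T_n, T_{n-1} must return nv for X_j,
% so the two index sets cover {1,...,m}:
\[
  \mathbb{J} \cup \mathbb{J}' = \{1,\ldots,m\}, \qquad m = n-3 .
\]
% T_{n-1} is chosen as the transaction with the larger index set, hence
\[
  |\mathbb{J}| \;\geq\; \max\bigl(|\mathbb{J}|, |\mathbb{J}'|\bigr)
  \;\geq\; \left\lceil \frac{m}{2} \right\rceil
  \;=\; \left\lceil \frac{n-3}{2} \right\rceil
  \;\geq\; \frac{n-3}{2}
  \;=\; \Omega(n).
\]
% Example: n = 10 gives m = 7 and at least \lceil 7/2 \rceil = 4 RAW/AWARs.
```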

5 Upper bound for opaque progressive TMs 

In this section, we describe a progressive, opaque TM implementation LP (Algorithm 1) that is
not subject to any of the lower bounds inherent to implementations in $\mathcal{OF}$ (cf. Figure 1). Our
implementation satisfies strict DAP, every transaction performs at most a single RAW and every
t-read operation incurs $O(1)$ memory stalls in any execution.

Base objects. For every t-object $X_j$, LP maintains a base object $v_j$ that stores the value of $X_j$.
Additionally, for each $X_j$, there is a bit $L_j$, which if set, indicates the presence of an updating
transaction writing to $X_j$. For every process $p_i$ and t-object $X_j$, LP maintains a single-writer bit
$r_{ij}$ (only $p_i$ is allowed to write to $r_{ij}$). Each of these base objects may be accessed only via read
and write primitives.

Updating transactions. The $\mathit{write}_k(X, v)$ implementation by process $p_i$ simply stores the value
$v$ locally, deferring the actual updates to $\mathit{tryC}_k$. During $\mathit{tryC}_k$, process $p_i$ attempts to obtain
exclusive write access to every $X_j \in Wset(T_k)$. This is realized through the single-writer bits,
which ensure that no other transaction may write to base objects $v_j$ and $L_j$ until $T_k$ relinquishes
its exclusive write access to $Wset(T_k)$. Specifically, process $p_i$ writes 1 to each $r_{ij}$, then checks
that no other process $p_t$ has written 1 to any $r_{tj}$ by executing a series of reads (incurring a single
RAW). If there exists such a process that concurrently contends on the write set of $T_k$, then for each
$X_j \in Wset(T_k)$, $p_i$ writes 0 to $r_{ij}$ and returns $A_k$. If successful in obtaining exclusive write access
to $Wset(T_k)$, $p_i$ sets the bit $L_j$ for each $X_j$ in its write set. The implementation of $\mathit{tryC}_k$ now checks
if any t-object in its read set is concurrently contended by another transaction and then validates
its read set. If there is contention on the read set or validation fails, indicating the presence of
a concurrent conflicting transaction, the transaction is aborted. If not, $p_i$ writes the values of
the t-objects to shared memory and relinquishes exclusive write access to each $X_j \in Wset(T_k)$ by
writing 0 to each of the base objects $L_j$ and $r_{ij}$.
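The acquisition and release of exclusive write access described above can be sketched as follows. This is a minimal, single-threaded Python model and not the authors' code: the class name `LockTable` and its dictionary layout are assumptions, and atomicity of individual reads and writes is simply taken for granted because each method runs to completion.

```python
class LockTable:
    """Toy model of the single-writer bits r[i, x] and the lock bits L[x]."""

    def __init__(self, n_procs, objects):
        self.n_procs = n_procs
        self.r = {(i, x): 0 for i in range(n_procs) for x in objects}
        self.L = {x: 0 for x in objects}

    def acquire(self, i, wset):
        # Announce intent on every object of the write set (writes) ...
        for x in wset:
            self.r[(i, x)] = 1
        # ... then look for any other announcer (reads): the single RAW.
        contended = any(
            self.r[(t, x)] == 1
            for x in wset
            for t in range(self.n_procs)
            if t != i
        )
        if contended:
            for x in wset:
                self.r[(i, x)] = 0  # back off; the caller aborts
            return False
        for x in wset:
            self.L[x] = 1           # signal an updating transaction to readers
        return True

    def release(self, i, wset):
        for x in wset:
            self.L[x] = 0
        for x in wset:
            self.r[(i, x)] = 0
```

A failed attempt clears every bit the process set, so a later transaction with a disjoint or now-uncontended write set can still acquire it.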

Read operations. The implementation first reads the value of t-object $X_j$ from base object $v_j$
and then reads the bit $L_j$ to detect contention with an updating transaction. If $L_j$ is set, the
transaction is aborted; if not, read validation is performed on the entire read set. If the validation
fails, the transaction is aborted. Otherwise, the implementation returns the value of $X_j$. For a
read-only transaction $T_k$, $\mathit{tryC}_k$ simply returns the commit response.
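The read path can be illustrated in the same style. The sketch below is a hedged, single-threaded Python model: `Store`, `Txn`, and the `ABORT` sentinel are hypothetical names, and values are stored as (value, writer id) pairs, following the tagged values used in Algorithm 1. Note that the read applies no write to shared state, which is what makes it invisible.

```python
ABORT = object()  # stands in for the abort response A_k

class Store:
    def __init__(self, objects, initial=0):
        self.v = {x: (initial, 0) for x in objects}  # (value, id of writing txn)
        self.L = {x: 0 for x in objects}             # set while an updater holds x

class Txn:
    def __init__(self, store, k):
        self.store, self.k = store, k
        self.rset = {}  # t-object -> (value, writer id) observed at first read

    def validation_fails(self):
        # Some previously read object no longer holds the tagged value we saw.
        return any(self.store.v[x] != seen for x, seen in self.rset.items())

    def read(self, x):
        if x in self.rset:                 # repeated read: answer from the read set
            return self.rset[x][0]
        self.rset[x] = self.store.v[x]     # read the value first ...
        if self.store.L[x] != 0:           # ... then the lock bit
            return ABORT
        if self.validation_fails():
            return ABORT
        return self.rset[x][0]
```

Reading the value before the lock bit matters in this model: an updater sets the bit before it writes the value, so a reader that sees the bit clear after reading the value has not observed a write in progress.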

Complexity. Observe that our implementation uses invisible reads since read-only transactions
do not apply any nontrivial primitives. Any updating transaction performs at most a single RAW
in the course of acquiring exclusive write access to the transaction's write set. Consequently, every
transaction performs $O(1)$ non-overlapping RAWs in any execution.

Recall that a transaction may write to base objects $v_j$ and $L_j$ only after obtaining exclusive
write access to t-object $X_j$, which in turn is realized via single-writer base objects. Thus, no
transaction performs a write to any base object $b$ immediately after a write to $b$ by another
transaction, i.e., every transaction incurs only $O(1)$ memory stalls on account of any event it
performs. Since the $\mathit{read}_k(X_j)$ implementation only accesses base objects $v_j$ and $L_j$, and
validating $T_k$'s read set does not cause any stalls, it follows that each t-operation performs $O(1)$
stalls in every execution.

Moreover, LP ensures that any two transactions $T_i$ and $T_j$ access the same base object iff
there exists $X \in Dset(T_i) \cap Dset(T_j)$ (strict DAP) and maintains exactly one version for every
t-object at any prefix of the execution.

Theorem 10. Algorithm 1 describes a progressive, opaque and strict DAP TM implementation
LP that provides wait-free TM-liveness, uses invisible reads and in every execution $E$ of LP,

• every transaction $T \in txns(E)$ applies only read and write primitives in $E$,

• every transaction $T \in txns(E)$ performs at most a single RAW,

• for every transaction $T \in txns(E)$, every t-read operation performed by $T$ incurs $O(1)$ memory
stalls in $E$.


6 Related work 


The lower bounds and impossibility results presented in this paper apply to obstruction-free TMs,
such as DSTM [20], FSTM [13], and others [13, 25, 30]. Our upper bound is inspired by the
progressive TM of [23].

Attiya et al. [5] were the first to formally define DAP for TMs. They proved the impossibility
of implementing weak DAP strictly serializable TMs that use invisible reads and guarantee that
read-only transactions eventually commit, while updating transactions are guaranteed to commit
only when they run sequentially [5]. This class is orthogonal to the class of obstruction-free TMs,
as is the proof technique used to establish the impossibility.

Perelman et al. [27] showed that mv-permissive weak DAP TMs cannot be implemented. In
mv-permissive TMs, only updating transactions may be aborted, and only when they conflict
with other updating transactions. In particular, read-only transactions cannot be aborted and
updating transactions may sometimes be aborted even in the absence of step contention, which
makes the impossibility result in [27] unrelated to ours.


Guerraoui and Kapalka [17] proved that it is impossible to implement strict DAP obstruction-
free TMs. They also proved that a strictly serializable TM that provides OF TM-progress and
wait-free TM-liveness cannot be implemented using only read and write primitives. We show that
progressive TMs are not subject to either of these lower bounds.

Attiya et al. [3] introduced the RAW/AWAR metric and proved that it is impossible to derive
RAW/AWAR-free implementations of a wide class of data types that include sets, queues and
deadlock-free mutual exclusion. The metric was previously used in [23] to measure the complexity
of read-only transactions in a strictly stronger (than $\mathcal{OF}$) class of permissive TMs. Detailed
coverage on memory fences and the RAW/AWAR metric can be found in [26].

To derive the linear lower bound on the memory stall complexity of obstruction-free TMs, we
adopted the definition of a $k$-stall execution and certain proof steps from [2, 10].


7 Discussion 


Lower bounds for obstruction-free TMs. We chose obstruction-freedom to elucidate non-
blocking TM-progress since it is a very weak non-blocking progress condition [21]. As highlighted in
the paper by Ennals [12], (1) obstruction-freedom increases the number of concurrently executing
transactions since transactions cannot wait for inactive transactions to complete, and (2) while
performing a t-read, obstruction-free TMs like [13, 20] must forcefully abort pending conflicting
transactions. Intuitively, (1) allows us to construct executions in which some pending transaction
is stalled while accessing a base object by all other concurrent transactions waiting to apply
nontrivial primitives on the base object. Observation (2) inspires the proof of the impossibility of
invisible reads in Theorem 2. Typically, the reading transaction must acquire exclusive ownership
of the object via mutual exclusion or employing a read-modify-write primitive like compare-and-
swap, motivating the linear lower bound on expensive synchronization in Theorem 6. In practice
though, obstruction-free TMs may possibly circumvent these lower bounds in models that allow
the use of contention managers [28].

Observe that Theorems 2 and 5 assume strict serializability and thus, also hold under the
assumption of stronger TM-correctness conditions like opacity, virtual-world consistency [22] and
TMS [9].

Since there are at most $n$ concurrent transactions, we cannot do better than $(n-1)$ stalls (cf.
Definition 1). Thus, the lower bound of Theorem 5 is tight. Moreover, we conjecture that the linear
(in $n$) lower bound of Theorem 6 for RW DAP opaque obstruction-free TMs can be strengthened
to be linear in the size of the transaction's read set. Then, Algorithm 2 in Appendix B would
allow us to establish a linear tight bound (in the size of the transaction's read set) for RW DAP
opaque obstruction-free TMs.


Progressive vs. obstruction-free TMs. Progressiveness is a blocking TM-progress condition
that is satisfied by several popular TM implementations like TL2 [7] and NOrec [6]. In general,
progressiveness and obstruction-freedom are incomparable. On the one hand, a t-read of $X$ by a
transaction $T$ that runs step contention-free from a configuration that contains an incomplete
t-write to $X$ is typically blocked or aborted in lock-based TMs; obstruction-free TMs, however,
must ensure that $T$ completes its t-read of $X$ without blocking or aborting. On the other
hand, progressiveness requires two non-conflicting transactions to commit even in executions that
are not step contention-free; but this is not guaranteed by obstruction-freedom.

Intuitively, progressive implementations are not forced to abort conflicting transactions, which
allows us to employ invisible reads and to derive implementations with constant stall and RAW/AWAR
complexity. While it is relatively easy to derive standalone progressive TM implementations that are
not individually subject to the lower bounds of obstruction-free TMs (cf. Figure 1), our progressive
opaque TM implementation LP is not subject to any of the lower bounds we prove for implementations in
$\mathcal{OF}$.


Circa 2005, several papers presented the case for a shift from TMs that provide obstruction-
free TM-progress to lock-based progressive TMs [7, 8, 12]. They argued that lock-based TMs tend
to outperform obstruction-free ones by allowing for simpler algorithms with lower overheads and
their inherent progress issues may be resolved using timeouts and contention-managers. The lower
bounds for non-blocking TMs and the complexity gap with our progressive TM implementation
established in this paper suggest that this course correction was indeed justified.

A Opaque progressive TM implementation LP

In this section, we describe our blocking TM implementation LP that satisfies progressiveness and
opacity [17]. We begin with the formal definition of opacity.

For simplicity of presentation, we assume that each execution $E$ begins with an “imaginary”
transaction $T_0$ that writes initial values to all t-objects and commits before any other transaction
begins in $E$. Let $E$ be a t-sequential execution. For every operation $\mathit{read}_k(X)$ in $E$, we define the
latest written value of $X$ as follows: (1) If $T_k$ contains a $\mathit{write}_k(X, v)$ preceding $\mathit{read}_k(X)$, then the
latest written value of $X$ is the value of the latest such write to $X$. (2) Otherwise, if $E$ contains
a $\mathit{write}_m(X, v)$, $T_m$ precedes $T_k$, and $T_m$ commits in $E$, then the latest written value of $X$ is the
value of the latest such write to $X$ in $E$. (This write is well-defined since $E$ starts with $T_0$ writing
to all t-objects.) We say that $\mathit{read}_k(X)$ is legal in a t-sequential execution $E$ if it returns the latest
written value of $X$, and $E$ is legal if every $\mathit{read}_k(X)$ in $E$ that does not return $A_k$ is legal in $E$.
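The two-case definition of the latest written value translates directly into a short checker. The Python sketch below is an illustration, not part of the paper: the history encoding (a list of `(txn_id, ops, committed)` triples in t-sequential order, with operations `(kind, object, value)`) and the `ABORTED` marker are assumptions made for the example.

```python
ABORTED = "A"  # hypothetical marker for a read that returned the abort response

def latest_written_value(history, k, op_index, x):
    """Latest written value of x as seen by operation number `op_index` of txn `k`."""
    latest = None  # latest write to x by a committed transaction preceding k
    for txn_id, ops, committed in history:
        if txn_id == k:
            # Case (1): k's own earlier writes to x take priority.
            own = [val for kind, obj, val in ops[:op_index]
                   if kind == "write" and obj == x]
            return own[-1] if own else latest
        if committed:
            # Case (2): remember writes of committed predecessors.
            for kind, obj, val in ops:
                if kind == "write" and obj == x:
                    latest = val
    raise KeyError(k)

def is_legal(history):
    """True if every non-aborted read returns the latest written value."""
    for txn_id, ops, _ in history:
        for idx, (kind, obj, val) in enumerate(ops):
            if kind == "read" and val != ABORTED:
                if val != latest_written_value(history, txn_id, idx, obj):
                    return False
    return True
```

Including the imaginary transaction $T_0$ as the first entry of the history guarantees that `latest` is always defined for case (2).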

For a history $H$, a completion of $H$, denoted $\bar{H}$, is a history derived from $H$ through the
following procedure: (1) for every incomplete t-operation $op_k$ of $T_k \in txns(H)$ in $H$, if $op_k =
\mathit{read}_k \vee \mathit{write}_k$, insert $A_k$ somewhere after the invocation of $op_k$; otherwise, if $op_k = \mathit{tryC}_k()$, insert
$C_k$ or $A_k$ somewhere after the last event of $T_k$. (2) for every complete transaction $T_k$ that is not
t-complete, insert $\mathit{tryC}_k \cdot A_k$ somewhere after the last event of transaction $T_k$.

Definition 2. A finite history $H$ is opaque if there is a legal t-complete t-sequential history $S$,
such that (1) for any two transactions $T_k, T_m \in txns(H)$, if $T_k \prec_H^{RT} T_m$, then $T_k$ precedes $T_m$ in
$S$, and (2) $S$ is equivalent to a completion of $H$.

Algorithm 1 Strict DAP progressive opaque TM implementation LP; code for T_k executed by
process p_i

 1: Shared base objects:
 2:   v_j, for each t-object X_j, allows reads and writes
 3:   r_ij, for each process p_i and t-object X_j
 4:     single-writer bit
 5:     allows reads and writes
 6:   L_j, for each t-object X_j
 7:     allows reads and writes
 8: Local variables:
 9:   Rset_k, Wset_k for every transaction T_k;
10:     dictionaries storing {X_m, v_m}

11: read_k(X_j):
12:   if X_j ∉ Rset(T_k) then
13:     [ov_j, k_j] := read(v_j)
14:     Rset(T_k) := Rset(T_k) ∪ {X_j, [ov_j, k_j]}
15:     if read(L_j) ≠ 0 then
16:       Return A_k
17:     if validate() then
18:       Return A_k
19:     Return ov_j
20:   else
21:     [ov_j, ⊥] := Rset(T_k).locate(X_j)
22:     Return ov_j

23: write_k(X_j, v):
24:   nv_j := v
25:   Wset(T_k) := Wset(T_k) ∪ {X_j}
26:   Return ok

27: tryC_k():
28:   if |Wset(T_k)| = 0 then
29:     Return C_k
30:   locked := acquire(Wset(T_k))
31:   if ¬ locked then
32:     Return A_k
33:   if isAbortable() then
34:     release(Wset(T_k))
35:     Return A_k
      // Exclusive write access to each v_j
36:   for all X_j ∈ Wset(T_k) do
37:     write(v_j, [nv_j, k])
38:   release(Wset(T_k))
39:   Return C_k

40: Function: release(Q):
41:   for all X_j ∈ Q do
42:     write(L_j, 0)
43:   for all X_j ∈ Q do
44:     write(r_ij, 0)
45:   Return ok

46: Function: acquire(Q):
47:   for all X_j ∈ Q do
48:     write(r_ij, 1)
49:   if ∃ X_j ∈ Q; t ≠ i : read(r_tj) = 1 then
50:     for all X_j ∈ Q do
51:       write(r_ij, 0)
52:     Return false
      // Exclusive write access to each L_j
53:   for all X_j ∈ Q do
54:     write(L_j, 1)
55:   Return true

56: Function: isAbortable():
57:   if ∃ X_j ∈ Rset(T_k) : X_j ∉ Wset(T_k) ∧ read(L_j) ≠ 0 then
58:     Return true
59:   if validate() then
60:     Return true
61:   Return false

62: Function: validate():
      // Read validation
63:   if ∃ X_j ∈ Rset(T_k) : [ov_j, k_j] ≠ read(v_j) then
64:     Return true
65:   Return false


A finite history $H$ is strictly serializable if there is a legal t-complete t-sequential history $S$,
such that (1) for any two transactions $T_k, T_m \in txns(H)$, if $T_k \prec_H^{RT} T_m$, then $T_k$ precedes $T_m$
in $S$, and (2) $S$ is equivalent to $cseq(\bar{H})$, where $\bar{H}$ is some completion of $H$ and $cseq(\bar{H})$ is the
subsequence of $\bar{H}$ reduced to committed transactions in $\bar{H}$.

We refer to $S$ as a serialization of $H$.

We now prove that LP implements an opaque TM. 

We introduce the following technical definition: process p_i holds a lock on X_j after an execution
π of Algorithm 1 if π contains the invocation of acquire(Q), X_j ∈ Q, by p_i that returned true, but
does not contain a subsequent invocation of release(Q'), X_j ∈ Q', by p_i in π.

Lemma 11. For any object X_j, and any execution π of Algorithm 1, there exists at most one
process that holds a lock on X_j after π.


Proof. Assume, by contradiction, that there exists an execution π after which processes p_i and p_k
hold a lock on the same object, say X_j. In order to hold the lock on X_j, process p_i writes 1 to
register r_{ij} and then checks if any other process p_k has written 1 to r_{kj}. Since the corresponding
operation acquire(Q), X_j ∈ Q, invoked by p_i returns true, p_i read 0 in r_{kj} in Line 49. But then p_k
also writes 1 to r_{kj} and later reads that r_{ij} is 1. This is because p_k can write 1 to r_{kj} only after
the read of r_{kj} returned 0 to p_i, which is preceded by the write of 1 to r_{ij}. Hence, there exists an
object X_j such that r_{ij} = 1, i ≠ k, so the condition in Line 49 holds for p_k and its invocation of
acquire returns false, a contradiction.

□
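The announce-then-check pattern behind Lemma 11 can be illustrated with a small single-threaded model. This is only a sketch under stated assumptions: the class `LockTable` and its method names `announce` and `check` are hypothetical, base-object steps are modelled as plain Python assignments, and a process is assumed to compare against the bits of all processes other than itself.

```python
class LockTable:
    """Single-threaded model of the r_{ij} bits and L_j locks of Algorithm 1."""

    def __init__(self, num_processes, objects):
        self.n = num_processes
        # r[i][j]: bit written only by process i, announcing interest in X_j.
        self.r = [{j: 0 for j in objects} for _ in range(num_processes)]
        # L[j]: lock bit of t-object X_j, written only by the lock holder.
        self.L = {j: 0 for j in objects}

    def announce(self, i, Q):
        """Lines 47-48: raise r_{ij} for every X_j in Q."""
        for j in Q:
            self.r[i][j] = 1

    def check(self, i, Q):
        """Lines 49-55: back off if another process announced, else lock."""
        others = [t for t in range(self.n) if t != i]
        if any(self.r[t][j] == 1 for t in others for j in Q):
            for j in Q:
                self.r[i][j] = 0
            return False
        for j in Q:
            self.L[j] = 1
        return True

    def acquire(self, i, Q):
        self.announce(i, Q)
        return self.check(i, Q)

    def release(self, i, Q):
        """Lines 41-45: clear the lock bits, then the announcement bits."""
        for j in Q:
            self.L[j] = 0
        for j in Q:
            self.r[i][j] = 0

    def holders(self, j):
        """Processes whose announcement bit is set while X_j is locked."""
        return [i for i in range(self.n) if self.L[j] == 1 and self.r[i][j] == 1]
```

Splitting `acquire` into `announce` and `check` makes it possible to replay the interleaving used in the proof: if two processes both announce before either checks, the first one to check sees the other's bit and backs off, so at most one of them ends up holding the lock.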


Observation 12. Let π be any execution of Algorithm 1. Then, any updating transaction
T_k ∈ txns(π) executed by process p_i writes to L_j (in Line 54) or v_j (in Line 37) for some
X_j ∈ Wset(T_k) immediately after π iff p_i holds the lock on X_j after π.

Lemma 13. Algorithm 1 implements an opaque TM.

Proof. Let E be any finite execution of Algorithm 1. Let <_E denote a total order on events in E.

Let H denote a subsequence of E constructed by selecting linearization points of t-operations
performed in E. The linearization point of a t-operation op, denoted as ℓ_op, is associated with
a base object event or an event performed between the invocation and response of op using the
following procedure.

Completions. First, we obtain a completion of E by removing some pending invocations and
adding responses to the remaining pending invocations involving a transaction T_k as follows: every
incomplete read_k, write_k operation is removed from E; an incomplete tryC_k is removed from E if
T_k has not performed any write to a base object during the release function in Line 38, otherwise
it is completed by including C_k after E.

Linearization points. Now a linearization H of E is obtained by associating linearization points
to t-operations in the obtained completion of E as follows:

• For every t-read op_k that returns a non-A_k value, ℓ_{op_k} is chosen as the event in Line 13 of
Algorithm 1; else, ℓ_{op_k} is chosen as the invocation event of op_k.

• For every op_k = write_k that returns, ℓ_{op_k} is chosen as the invocation event of op_k.

• For every op_k = tryC_k that returns C_k such that Wset(T_k) ≠ ∅, ℓ_{op_k} is associated with the
response of acquire in Line 30; else, if op_k returns A_k, ℓ_{op_k} is associated with the invocation
event of op_k.

• For every op_k = tryC_k that returns C_k such that Wset(T_k) = ∅, ℓ_{op_k} is associated with
Line 29.

<_H denotes a total order on t-operations in the complete sequential history H.

Serialization points. The serialization of a transaction T_j, denoted as δ_{T_j}, is associated with the
linearization point of a t-operation performed within the execution of T_j.

We obtain a t-complete history H̄ from H as follows: for every transaction T_k in H that is
complete, but not t-complete, we insert tryC_k · A_k after H.

A t-complete t-sequential history S is obtained by associating serialization points to
transactions in H̄ as follows:

• If T_k is an update transaction that commits, then δ_{T_k} is ℓ_{tryC_k}.

• If T_k is a read-only or aborted transaction in H̄, δ_{T_k} is assigned to the linearization point of
the last t-read that returned a non-A_k value in T_k.

<_S denotes a total order on transactions in the t-sequential history S.

Claim 14. If T_i ≺_H T_j, then T_i <_S T_j.

Proof. This follows from the fact that for a given transaction, its serialization point is chosen
between the first and last event of the transaction, implying that if T_i ≺_H T_j, then
δ_{T_i} <_E δ_{T_j}, which implies T_i <_S T_j. □

Claim 15. Let T_k be any updating transaction that returns false from the invocation of isAbortable
in Line 33. Then, T_k returns C_k within a finite number of its own steps in any extension of E.

Proof. Observe that T_k performs the write to base objects v_j for every X_j ∈ Wset(T_k) and
then invokes release in Lines 37 and 38 respectively. Since neither of these involves aborting the
transaction or contains unbounded loops or waiting statements, it follows that T_k will return C_k
within a finite number of its steps. □

Claim 16. S is legal. 

Proof. Observe that for every read_j(X_m) → v, there exists some transaction T_i that performs
write_i(X_m, v) and completes the event in Line 37 such that read_j(X_m) ⊀_H^{RT} write_i(X_m, v). More
specifically, read_j(X_m) returns, as a non-abort response, the value of the base object v_m, and v_m
can be updated only by a transaction T_i such that X_m ∈ Wset(T_i). Since read_j(X_m) returns the
response v, the event in Line 13 succeeds the event in Line 37 performed by tryC_i. Consequently, by
Claim 15 and the assignment of linearization points, ℓ_{tryC_i} <_E ℓ_{read_j(X_m)}. Since, for any updating
committing transaction T_i, δ_{T_i} = ℓ_{tryC_i}, by the assignment of serialization points, it follows that
δ_{T_i} <_E δ_{T_j}.

Thus, to prove that S is legal, it suffices to show that there does not exist a transaction T_k
that returns C_k in S and performs write_k(X_m, v'); v' ≠ v such that T_i <_S T_k <_S T_j. Suppose that
there exists a committed transaction T_k, X_m ∈ Wset(T_k), such that T_i <_S T_k <_S T_j.

T_i and T_k are both updating transactions that commit. Thus,

(T_i <_S T_k) ⟺ (δ_{T_i} <_E δ_{T_k})
(δ_{T_i} <_E δ_{T_k}) ⟺ (ℓ_{tryC_i} <_E ℓ_{tryC_k})

Since T_j reads the value of X_m written by T_i, one of the following is true:
ℓ_{tryC_i} <_E ℓ_{tryC_k} <_E ℓ_{read_j(X_m)} or ℓ_{tryC_i} <_E ℓ_{read_j(X_m)} <_E ℓ_{tryC_k}. Let T_i and T_k be
executed by processes p_i and p_k respectively.

Consider the case that ℓ_{tryC_i} <_E ℓ_{tryC_k} <_E ℓ_{read_j(X_m)}.

By the assignment of linearization points, T_k returns a response from the event in Line 30
before the read of v_m by T_j in Line 13. Since T_i and T_k are both committed in E, p_k returns true
from the event in Line 30 only after T_i writes 0 to r_{im} in Line 44 (Lemma 11).

Recall that read_j(X_m) checks if X_m is locked by a concurrent transaction (i.e., L_j ≠ 0), then
performs read-validation (Line 15) before returning a matching response. Consider the following
possible sequence of events: T_k returns true from the acquire function invocation, sets L_j to 1 for
every X_j ∈ Wset(T_k) (Line 54) and updates the value of X_m to shared memory (Line 37). The
implementation of read_j(X_m) then reads the base object v_m associated with X_m, after which T_k
releases X_m by writing 0 to r_{km}, and finally T_j performs the check in Line 15. However, read_j(X_m)
is forced to return A_j because X_m ∈ Rset(T_j) (Line 14) and has been invalidated since last reading
its value.

Otherwise suppose that T_k acquires exclusive access to X_m by writing 1 to r_{km} and
returns true from the invocation of acquire, updates v_m (in Line 37), T_j reads v_m, T_j performs the
check in Line 15 and finally T_k releases X_m by writing 0 to r_{km}. Again, read_j(X_m) returns A_j
since T_j reads that r_{km} is 1, a contradiction.

Thus, ℓ_{tryC_i} <_E ℓ_{read_j(X_m)} <_E ℓ_{tryC_k}.

We now need to prove that δ_{T_j} indeed precedes ℓ_{tryC_k} in E.

Consider the two possible cases:

• Suppose that T_j is a read-only or aborted transaction in H̄. Then, δ_{T_j} is assigned to the
last t-read performed by T_j that returns a non-A_j value. If read_j(X_m) is not the last t-read
performed by T_j that returned a non-A_j value, then there exists a read_j(X_z) performed by T_j
such that ℓ_{read_j(X_m)} <_E ℓ_{tryC_k} <_E ℓ_{read_j(X_z)}. Now assume that ℓ_{tryC_k} must precede ℓ_{read_j(X_z)}
to obtain a legal S. Since T_k and T_j are concurrent in E, we are restricted to the case that
T_k performs a write_k(X_z, v) and read_j(X_z) returns v. However, we claim that this t-read of
X_z must abort by performing the checks in Line 15. Observe that T_k writes 1 to L_m, L_z
each (Line 54) and then writes new values to base objects v_m, v_z (Line 37). Since read_j(X_z)
returns a non-A_j response, T_k writes 0 to L_z before the read of L_z by read_j(X_z) in Line 15.
Thus, the t-read of X_z would return A_j (in Line 17, after validation of the read set) since X_m
has been updated, a contradiction to the assumption that it is the last t-read by T_j to return
a non-A_j response.

• Suppose that T_j is an updating transaction that commits; then δ_{T_j} = ℓ_{tryC_j}, which implies
that ℓ_{read_j(X_m)} <_E ℓ_{tryC_k} <_E ℓ_{tryC_j}. Then, T_j must necessarily perform the checks in Line 33
and read that L_m is 1. Thus, T_j must return A_j, a contradiction to the assumption that T_j
is a committed transaction.

□


The conjunction of Claims 14 and 16 establishes that Algorithm 1 is opaque.


□ 


Theorem 9. Algorithm 1 describes a progressive, opaque and strict DAP TM implementation LP
that provides wait-free TM-liveness, uses invisible reads and in every execution E of LP,

• every transaction T ∈ txns(E) applies only read and write primitives in E,

• every transaction T ∈ txns(E) performs at most a single RAW,

• for every transaction T ∈ txns(E), every t-read operation performed by T incurs O(1)
memory stalls in E.


Proof. (TM-liveness and TM-progress) Since none of the implementations of the t-operations in
Algorithm 1 contain unbounded loops or waiting statements, every t-operation op_k returns a
matching response after taking a finite number of steps in every execution. Thus, Algorithm 1
provides wait-free TM-liveness.

To prove progressiveness, we proceed by enumerating the cases under which a transaction T_k
may be aborted.

• Suppose that there exists a read_k(X_j) performed by T_k that returns A_k from Line 15. Thus,
there exists a process p_t executing a transaction that has written 1 to r_{tj} in Line 48 but has
not yet written 0 to r_{tj} in Line 44, or some t-object in Rset(T_k) has been updated since its
t-read by T_k. In both cases, there exists a concurrent transaction performing a t-write to
some t-object in Rset(T_k).

• Suppose that tryC_k performed by T_k returns A_k from Line 31. Thus, there exists a
process p_t executing a transaction that has written 1 to r_{tj} in Line 48, but has not yet
written 0 to r_{tj} in Line 44. Thus, T_k encounters step-contention with another transaction
that concurrently attempts to update a t-object in Wset(T_k).

• Suppose that tryC_k performed by T_k returns A_k from Line 33. Since T_k returns A_k
from Line 33 for the same reasons that a t-read returns A_k in the first case above,
the proof follows.

(Strict disjoint-access parallelism) Consider any execution E of Algorithm 1 and let T_i and T_j be
any two transactions that participate in E and access the same base object b in E.

• Suppose that T_i and T_j contend on base object v_j or L_j. Since for every t-object X_j, there
exist distinct base objects v_j and L_j, T_i and T_j contend on v_j or L_j only if
X_j ∈ Dset(T_i) ∩ Dset(T_j).

• Suppose that T_i and T_j contend on base object r_{ij}. Without loss of generality, let p_i be the
process executing transaction T_i, X_j ∈ Wset(T_i), that writes 1 to r_{ij} in Line 48. Indeed,
no other process executing a transaction that writes to X_j can write to r_{ij}. Transaction T_j
reads r_{ij} only if X_j ∈ Dset(T_j), as evident from the accesses performed in Lines 48, 49 and 44.

Thus, T_i and T_j access the same base object only if they access a common t-object.

(Opacity) Follows from Lemma 13.

(Invisible reads) Observe that read-only transactions do not perform any nontrivial events.
Secondly, in any execution E of Algorithm 1 and any transaction T_k ∈ txns(E), if X_j ∈ Rset(T_k),
T_k does not write to any of the base objects associated with X_j nor write any information that
reveals its read set to other transactions.

(Complexity) Consider any execution E of Algorithm 1.

• For any T_k ∈ txns(E), each read_k only applies trivial primitives in E while tryC_k simply
returns C_k if Wset(T_k) = ∅. Thus, Algorithm 1 uses invisible reads.

• Any read-only transaction T_k ∈ txns(E) does not perform any RAW or AWAR. An updating
transaction T_k executed by process p_i performs a sequence of writes (Line 48) to base
objects {r_{ij}} : X_j ∈ Wset(T_k), followed by a sequence of reads to base objects {r_{tj}} : t ∈
{1, ..., n}, X_j ∈ Wset(T_k) (Line 49), thus incurring a single multi-RAW.

• Let e be a write event performed by some transaction T_k executed by process p_i in E on
base objects v_j and L_j (Lines 37 and 54). Any transaction T_k performs a write to v_j or L_j
only after T_k writes 1 to r_{ij}, for every X_j ∈ Wset(T_k). Thus, by Lemmata 13 and 3, it
follows that events that involve an access to either of these base objects incur O(1) stalls.
Let e be a write event on base object r_{ij} (Line 48) while writing to t-object X_j. By
Algorithm 1, no other process can write to r_{ij}. It follows that any transaction T_k ∈ txns(E)
incurs O(1) memory stalls on account of any event it performs in E. Observe that any t-read
read_k(X_j) only accesses base objects v_j, L_j and other value base objects in Rset(T_k). But as
already established above, these are O(1) stall events. Hence, every t-read operation incurs
O(1) stalls in E.

□ 
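To make the invisible-read argument concrete, the following sketch models the t-read of Algorithm 1 (Lines 11-22) on a single thread. It is an illustration only: the names `Memory`, `t_read` and `install` are hypothetical, the writer side is reduced to a helper that assumes `acquire` has already succeeded, and a write counter stands in for "nontrivial events" on base objects.

```python
ABORT = "A_k"


class Memory:
    """Base objects v_j and L_j, plus a counter of write events."""

    def __init__(self, init):
        # v[x] holds the pair [value, id of the writing transaction];
        # id 0 is a hypothetical initial writer.
        self.v = {x: (val, 0) for x, val in init.items()}
        self.L = {x: 0 for x in init}
        self.writes = 0

    def write_v(self, x, pair):
        self.v[x] = pair
        self.writes += 1

    def write_L(self, x, bit):
        self.L[x] = bit
        self.writes += 1


def validate(mem, rset):
    """Lines 62-65: true iff some previously read pair has changed."""
    return any(mem.v[x] != seen for x, seen in rset.items())


def t_read(mem, rset, x):
    """Lines 11-22: read v_j, then check the lock bit and the read set."""
    if x in rset:
        return rset[x][0]
    rset[x] = mem.v[x]
    if mem.L[x] != 0:
        return ABORT
    if validate(mem, rset):
        return ABORT
    return rset[x][0]


def install(mem, k, updates):
    """Writer side of tryC_k, assuming acquire already returned true."""
    for x in updates:
        mem.write_L(x, 1)
    for x, val in updates.items():
        mem.write_v(x, (val, k))
    for x in updates:
        mem.write_L(x, 0)
```

In this model a t-read never increments `writes`, which mirrors the claim that reads apply only trivial primitives, while a conflicting committed update is detected by the next t-read through validation.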


B Obstruction-free TMs 

B.1 An opaque RW DAP TM implementation M ∈ OT

Lemma 10. Algorithm 2 implements an opaque TM.

Proof. Since opacity is a safety property, we only consider finite executions [4]. Let E be any
finite execution of Algorithm 2. Let <_E denote a total order on events in E.

Let H denote a subsequence of E constructed by selecting linearization points of t-operations
performed in E. The linearization point of a t-operation op, denoted as ℓ_op, is associated with a
base object event or an event performed during the execution of op using the following procedure.

Completions. First, we obtain a completion of E by removing some pending invocations and
adding responses to the remaining pending invocations involving a transaction T_k as follows: every
incomplete read_k, write_k, tryC_k operation is removed from E; an incomplete write_k is removed
from E.

Linearization points. We now associate linearization points to t-operations in the obtained
completion of E as follows:

• For every t-read op_k that returns a non-A_k value, ℓ_{op_k} is chosen as the event in Line [...] of
Algorithm 2; else, ℓ_{op_k} is chosen as the invocation event of op_k.

• For every t-write op_k that returns a non-A_k value, ℓ_{op_k} is chosen as the event in Line [...] of
Algorithm 2; else, ℓ_{op_k} is chosen as the invocation event of op_k.

• For every op_k = tryC_k that returns C_k, ℓ_{op_k} is associated with Line [...].

<_H denotes a total order on t-operations in the complete sequential history H.

Serialization points. The serialization of a transaction T_j, denoted as δ_{T_j}, is associated with the
linearization point of a t-operation performed during the execution of the transaction.

We obtain a t-complete history H̄ from H as follows: for every transaction T_k in H that is
complete, but not t-complete, we insert tryC_k · A_k after H.

H̄ is thus a t-complete sequential history. A t-complete t-sequential history S equivalent to H̄
is obtained by associating serialization points to transactions in H̄ as follows:

• If T_k is an update transaction that commits, then δ_{T_k} is ℓ_{tryC_k}.

• If T_k is an aborted or read-only transaction in H̄, then δ_{T_k} is assigned to the linearization
point of the last t-read that returned a non-A_k value in T_k.

<_S denotes a total order on transactions in the t-sequential history S.
Algorithm 2 RW DAP opaque TM implementation M ∈ OT; code for T_k

 1: Shared base objects:
 2:   tvar[m], storing [owner_m, oval_m, nval_m]
 3:     for each t-object X_m, supports read, write, cas
 4:     owner_m, a transaction identifier
 5:     oval_m ∈ V
 6:     nval_m ∈ V
 7:   status[k] ∈ {live, aborted, committed},
 8:     for each T_k; supports read, write, cas
 9: Local variables:
10:   Rset_k, Wset_k for every transaction T_k;
11:     dictionaries storing {X_m, tvar[m]}

12: read_k(X_m):
13:   [owner_m, oval_m, nval_m] ← tvar[m].read()
14:   if owner_m ≠ k then
15:     s_m ← status[owner_m].read()
16:     if s_m = committed then
17:       curr = nval_m
18:     else if s_m = aborted then
19:       curr = oval_m
20:     else
21:       if status[owner_m].cas(live, aborted) then
22:         curr = oval_m
23:       else
24:         Return A_k
25:     if status[k] = live ∧ ¬validate() then
26:       Rset(T_k).add({X_m, [owner_m, oval_m, nval_m]})
27:       Return curr
28:     Return A_k
29:   else
30:     Return Rset(T_k).locate(X_m)

31: Function: validate():
32:   if ∃ {X_j, [owner_j, oval_j, nval_j]} ∈ Rset(T_k) :
33:       ([owner_j, oval_j, nval_j] ≠ tvar[j].read()) then
34:     Return true
35:   Return false

36: write_k(X_m, v):
37:   [owner_m, oval_m, nval_m] ← tvar[m].read()
38:   if owner_m ≠ k then
39:     s_m ← status[owner_m].read()
40:     if s_m = committed then
41:       curr = nval_m
42:     else if s_m = aborted then
43:       curr = oval_m
44:     else
45:       if status[owner_m].cas(live, aborted) then
46:         curr = oval_m
47:       else
48:         Return A_k
49:     o_m ← tvar[m].cas([owner_m, oval_m, nval_m], [k, curr, v])
50:     if o_m ∧ status[k] = live then
51:       Wset_k.add({X_m, [k, curr, v]})
52:       Return ok
53:     else
54:       Return A_k
55:   else
56:     [owner_m, oval_m, nval_m] = Wset_k.locate(X_m)
57:     s = tvar[m].cas([owner_m, oval_m, nval_m], [k, oval_m, v])
58:     if s then
59:       Wset(T_k).add({X_m, [k, oval_m, v]})
60:       Return ok
61:     else
62:       Return A_k

63: tryC_k():
64:   if validate() then
65:     Return A_k
66:   if status[k].cas(live, committed) then
67:     Return C_k
68:   Return A_k

Claim 11. If T_i ≺_H^{RT} T_j, then T_i <_S T_j.

Proof. This follows from the fact that for a given transaction, its serialization point is chosen
between the first and last event of the transaction, implying that if T_i ≺_H T_j, then
δ_{T_i} <_E δ_{T_j}, which implies T_i <_S T_j. □

Claim 12. If transaction T_i returns C_i in E, then status[i] = committed in E.

Proof. Transaction T_i must perform the event in Line 66 before returning C_i, i.e., the cas on its
own status to change the value to committed. The proof now follows from the fact that any other
transaction may change the status of T_i only if it is live (Lines 45 and 21). □


Claim 13. S is legal. 

Proof. Observe that for every read_j(X) → v, there exists some transaction T_i that performs
write_i(X, v) and completes the event in Line 49 to write v as the new value of X such that
read_j(X) ⊀_H^{RT} write_i(X, v). For any updating committing transaction T_i, δ_{T_i} = ℓ_{tryC_i}. Since
read_j(X) returns a response v, the event in Line 13 must succeed the event in Line 66 when T_i
changes status[i] to committed. Suppose otherwise; then read_j(X) subsequently forces T_i to abort
by writing aborted to status[i] and must return the old value of X that is updated by the previous
owner of X, which must be committed in E (Line 40). Since δ_{T_i} = ℓ_{tryC_i} precedes the event in
Line 66, it follows that δ_{T_i} <_E ℓ_{read_j(X)}.

We now need to prove that δ_{T_i} <_E δ_{T_j}. Consider the following cases:

• if T_j is an updating committed transaction, then δ_{T_j} is assigned to ℓ_{tryC_j}. But since
ℓ_{read_j(X)} <_E ℓ_{tryC_j}, it follows that δ_{T_i} <_E δ_{T_j}.

• if T_j is a read-only or aborted transaction, then δ_{T_j} is assigned to the last t-read that did
not abort. Again, it follows that δ_{T_i} <_E δ_{T_j}.

To prove that S is legal, we need to show that there does not exist any transaction T_k that
returns C_k in S and performs write_k(X, v'); v' ≠ v such that T_i <_S T_k <_S T_j. Now, suppose by
contradiction that there exists a committed transaction T_k, X ∈ Wset(T_k), that writes v' ≠ v to
X such that T_i <_S T_k <_S T_j. Since T_i and T_k are both updating transactions that commit,

(T_i <_S T_k) ⟺ (δ_{T_i} <_E δ_{T_k})
(δ_{T_i} <_E δ_{T_k}) ⟺ (ℓ_{tryC_i} <_E ℓ_{tryC_k})

Since T_j reads the value of X written by T_i, one of the following is true:
ℓ_{tryC_i} <_E ℓ_{tryC_k} <_E ℓ_{read_j(X)} or ℓ_{tryC_i} <_E ℓ_{read_j(X)} <_E ℓ_{tryC_k}.

If ℓ_{tryC_i} <_E ℓ_{tryC_k} <_E ℓ_{read_j(X)}, then the event in Line 66 performed by T_k when it changes the
status field to committed precedes the event in Line 13 performed by T_j. Since ℓ_{tryC_i} <_E ℓ_{tryC_k}
and both T_i and T_k are committed in E, T_k must perform the event in Line 37 after T_i changes
status[i] to committed since otherwise, T_k would perform the event in Line 45 and change status[i]
to aborted, thereby forcing T_i to return A_i. However, read_j(X) observes that the owner of X is
T_k and since the status of T_k is committed at this point in the execution, read_j(X) must return
v' and not v, a contradiction.

Thus, ℓ_{tryC_i} <_E ℓ_{read_j(X)} <_E ℓ_{tryC_k}. We now need to prove that δ_{T_j} indeed precedes
δ_{T_k} = ℓ_{tryC_k} in E.

Now consider two cases:

• Suppose that T_j is a read-only transaction. Then, δ_{T_j} is assigned to the last t-read performed
by T_j that returns a non-A_j value. If read_j(X) is not the last t-read that returned a non-A_j
value, then there exists a read_j(X') such that ℓ_{read_j(X)} <_E ℓ_{tryC_k} <_E ℓ_{read_j(X')}. But then
this t-read of X' must abort since the value of X has been updated by T_k since T_j first read
X, a contradiction.

• Suppose that T_j is an updating transaction that commits; then δ_{T_j} = ℓ_{tryC_j}, which implies
that ℓ_{read_j(X)} <_E ℓ_{tryC_k} <_E ℓ_{tryC_j}. Then, T_j must necessarily perform the validation of its
read set in Line 65 and return A_j, a contradiction.

□

Claims 11 and 13 establish that Algorithm 2 is opaque.

□
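The role of the ownership record [owner_m, oval_m, nval_m] in the argument above can be illustrated with a single-threaded sketch of the first t-write of a transaction to X_m (the owner_m ≠ k branch, Lines 37-54 of Algorithm 2). The names `Cell`, `resolve` and `first_write` are hypothetical, `cas` is modelled as an ordinary method rather than a hardware primitive, and transaction 0 is an assumed initial committed owner.

```python
LIVE, ABORTED, COMMITTED = "live", "aborted", "committed"


class Cell:
    """A base object supporting read and compare-and-swap (not thread-safe)."""

    def __init__(self, value):
        self.value = value

    def read(self):
        return self.value

    def cas(self, expected, new):
        if self.value == expected:
            self.value = new
            return True
        return False


def resolve(tvar_m, status):
    """Lines 37-48: return (record, current value), or None to signal A_k."""
    record = tvar_m.read()
    owner, oval, nval = record
    s = status[owner].read()
    if s == COMMITTED:
        return record, nval      # the owner's write took effect
    if s == ABORTED:
        return record, oval      # the owner's write is discarded
    if status[owner].cas(LIVE, ABORTED):
        return record, oval      # forcefully abort a live owner
    return None                  # the owner's status changed concurrently


def first_write(tvar_m, status, k, v):
    """Lines 49-54: install [k, curr, v] as the new ownership record."""
    resolved = resolve(tvar_m, status)
    if resolved is None:
        return "A_k"
    record, curr = resolved
    if tvar_m.cas(record, (k, curr, v)) and status[k].read() == LIVE:
        return "ok"
    return "A_k"
```

A short run shows why a value written by a forcefully aborted owner never becomes visible: the second writer keeps the old value as `curr`, and only the committed owner's new value is returned by a later resolution.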


Theorem 14. Algorithm 2 describes a RW DAP, opaque TM implementation M ∈ OT such that
every execution E of M is an O(n)-stall execution for any t-read operation and every read-only
transaction T ∈ txns(E) performs O(|Rset(T)|) AWARs in E.

Proof. (Opacity) Follows from Lemma 10.

(TM-liveness and TM-progress) Since none of the implementations of the t-operations in
Algorithm 2 contain unbounded loops or waiting statements, every t-operation op_k returns a matching
response after taking a finite number of steps. Thus, Algorithm 2 provides wait-free TM-liveness.

To prove OF TM-progress, we proceed by enumerating the cases under which a transaction T_k
may be aborted in any execution.

• Suppose that there exists a read_k(X_m) performed by T_k that returns A_k. If read_k(X_m)
returns A_k in Line 28, then there exists a concurrent transaction that updated a t-object in
Rset(T_k) or changed status[k] to aborted. In both cases, T_k returns A_k only because there is
step contention.

• Suppose that there exists a write_k(X_m, v) performed by T_k that returns A_k in Line 54. Thus,
either a concurrent transaction has changed status[k] to aborted or the value in tvar[m] has
been updated since the event in Line 37. In both cases, T_k returns A_k only because of step
contention with another transaction.

• Suppose that a read_k(X_m) or write_k(X_m, v) returns A_k in Lines 21 and 45 respectively. Thus,
a concurrent transaction has taken steps by updating the status of owner_m
since the read by T_k in Lines 13 and 37 respectively.

• Suppose that tryC_k() returns A_k in Line 62. This is because there exists a t-object in
Rset(T_k) that has been updated by a concurrent transaction, i.e., tryC_k() returns A_k
only on encountering step contention.

It follows that in any step contention-free execution of a transaction T_k from a T_k-free execution,
T_k must return C_k after taking a finite number of steps.

(Read-write disjoint-access parallelism) Consider any execution E of Algorithm 2 and let T_i
and T_j be any two transactions that contend on a base object b in E. We need to prove that
there is a path between a t-object in Dset(T_i) and a t-object in Dset(T_j) in G(T_i, T_j, E) or there
exists X ∈ Dset(T_i) ∩ Dset(T_j). Recall that there exists an edge between t-objects X and Y in
G(T_i, T_j, E) only if there exists a transaction T ∈ txns(E) such that {X, Y} ⊆ Wset(T).

• Suppose that T_i and T_j contend on base object tvar[m] belonging to t-object X_m in E. By
Algorithm 2, a transaction accesses X_m only if X_m is contained in its data set. Thus, both T_i
and T_j must access X_m.

• Suppose that T_i and T_j contend on base object status[i] in E (the case when T_i and T_j
contend on status[j] is symmetric). T_j accesses status[i] while performing a t-read of some
t-object X in Lines 15 and 21 only if T_i is the owner of X. Also, T_j accesses status[i] while
performing a t-write to X in Lines 39 and 45 only if T_i is the owner of X. But if T_i is the
owner of X, then X ∈ Wset(T_i).

• Suppose that T_i and T_j contend on base object status[m] belonging to some transaction T_m
in E. Firstly, observe that T_i or T_j access status[m] only if there exist t-objects X and Y
in Dset(T_i) and Dset(T_j) respectively such that {X, Y} ⊆ Wset(T_m). This is because T_i and
T_j would both read status[m] in Lines 15 (during t-read) and 39 (during t-write) only if T_m
was the previous owner of X and Y. Secondly, one of T_i or T_j applies a nontrivial primitive
to status[m] only if T_i and T_j read status[m] = live in Lines 15 (during t-read) and 37 (during
t-write). Thus, at least one of T_i or T_j is concurrent to T_m in E. It follows that there exists
a path between X and Y in G(T_i, T_j, E).

(Complexity) Every t-read operation performs at most one AWAR in an execution E (Line 21) of
Algorithm 2. It follows that any read-only transaction T_k ∈ txns(E) performs at most |Rset(T_k)|
AWARs in E.

The linear step-complexity is immediate from the fact that during the t-read operations, the
transaction validates its entire read set (Line 25). All other t-operations incur O(1) step-complexity
since they involve no iteration statements like for and while loops.

Since at most n-1 transactions may be t-incomplete at any point in an execution E, it follows
that E is at most an (n-1)-stall execution for any t-read op and every T ∈ txns(E) incurs O(n)
stalls on account of any event performed in E. More specifically, consider the following execution
E: for all i ∈ {1, ..., n-1}, each transaction T_i performs write_i(X_m, v) in a step-contention-free
execution until it is poised to apply a nontrivial event on tvar[m] (Line 49). By OF TM-progress,
we construct E such that each of the T_i is poised to apply a nontrivial event on tvar[m] after E.
Consider the execution fragment of read_n(X_m) that is poised to perform an event e that reads
tvar[m] (Line 13) immediately after E. In the constructed execution, T_n incurs O(n) stalls on
account of e and thus produces the desired (n-1)-stall execution for read_n(X_m). □
Algorithm 3 Weak DAP opaque implementation M ∈ OT; code for T_k

 1: read_k(X_m):
 2:   [owner_m, oval_m, nval_m] ← tvar[m].read()
 3:   if owner_m ≠ k then
 4:     s_m ← status[owner_m].read()
 5:     if s_m = committed then
 6:       curr = nval_m
 7:     else if s_m = aborted then
 8:       curr = oval_m
 9:     else
10:       if status[owner_m].cas(live, aborted) then
11:         curr = oval_m
          else
12:         Return A_k
13:     o_m ← tvar[m].cas([owner_m, oval_m, nval_m], [k, oval_m, nval_m])
14:     if o_m ∧ status[k] = live then
15:       Rset(T_k).add({X_m, [owner_m, oval_m, nval_m]})
16:       Return curr
17:   else
18:     Return Rset(T_k).locate(X_m)

19: tryC_k():
20:   if status[k].cas(live, committed) then
21:     Return C_k
22:   Return A_k


B.2 An opaque weak DAP implementation M ∈ OT

Algorithm 3 describes a weak DAP implementation in OT that does not satisfy read-write DAP.
The code for the t-write operations is identical to Algorithm 2.
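The difference from Algorithm 2 is that a t-read here is visible: it takes ownership of the t-object with a cas, which is what lets tryC_k shrink to a single cas on status[k]. The single-threaded sketch below illustrates this under explicit assumptions. The names `Cell` and `WeakDapTM` are hypothetical; transaction 0 is an assumed initial committed owner; the read set stores the value rather than the record; a failed ownership cas is assumed to return A_k; and the reader installs [k, curr, curr] rather than the record shown in Line 13, a choice made in this sketch so that later readers resolve to curr whether T_k commits or aborts.

```python
LIVE, ABORTED, COMMITTED = "live", "aborted", "committed"
ABORT, COMMIT = "A_k", "C_k"


class Cell:
    """A base object supporting read and compare-and-swap (not thread-safe)."""

    def __init__(self, value):
        self.value = value

    def read(self):
        return self.value

    def cas(self, expected, new):
        if self.value == expected:
            self.value = new
            return True
        return False


class WeakDapTM:
    def __init__(self, init):
        # Assumption: a fictitious transaction 0 committed the initial values.
        self.status = {0: Cell(COMMITTED)}
        self.tvar = {x: Cell((0, val, val)) for x, val in init.items()}

    def begin(self, k):
        self.status[k] = Cell(LIVE)
        return {}  # fresh read set for T_k

    def read(self, k, rset, x):
        record = self.tvar[x].read()
        owner, oval, nval = record
        if owner == k:
            return rset[x]
        s = self.status[owner].read()
        if s == COMMITTED:
            curr = nval
        elif s == ABORTED:
            curr = oval
        elif self.status[owner].cas(LIVE, ABORTED):
            curr = oval
        else:
            return ABORT
        # Visible read: become the owner of X_m (sketch installs curr twice).
        took = self.tvar[x].cas(record, (k, curr, curr))
        if took and self.status[k].read() == LIVE:
            rset[x] = curr
            return curr
        return ABORT

    def try_commit(self, k):
        # Constant number of steps: no read-set validation is needed.
        return COMMIT if self.status[k].cas(LIVE, COMMITTED) else ABORT
```

In a run where two transactions read the same t-object, the second reader aborts the first one, so the first transaction's commit attempt fails with a single step; this is the price paid for constant-step t-operations.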

Theorem 15. Algorithm 3 describes a weak DAP TM implementation M ∈ OT such that in any
execution E of M, for every transaction T ∈ txns(E), T performs O(1) steps during the execution
of any t-operation in E.


Proof. The proofs of opacity, TM-liveness and TM-progress are almost identical to the analogous
proofs for Algorithm 2.

(Weak disjoint-access parallelism) Consider any execution E of Algorithm 3 and let T_i and T_j
be any two transactions that contend on a base object b in E. We need to prove that there is
a path between a t-object in Dset(T_i) and a t-object in Dset(T_j) in G(T_i, T_j, E) or there exists
X ∈ Dset(T_i) ∩ Dset(T_j). Recall that there exists an edge between t-objects X and Y in G(T_i, T_j, E)
only if there exists a transaction T ∈ txns(E) such that {X, Y} ⊆ Dset(T).

• Suppose that T_i and T_j contend on base object tvar[m] belonging to t-object X_m in E. By
Algorithm 3, a transaction accesses X_m only if X_m is contained in its data set. Thus, both T_i
and T_j must access X_m.

• Suppose that T_i and T_j contend on base object status[i] in E (the case when T_i and T_j
contend on status[j] is symmetric). T_j accesses status[i] while performing a t-read of some
t-object X in Lines 4 and 10 only if T_i is the owner of X. Also, T_j accesses status[i] while
performing a t-write to X in Lines 39 and 45 of Algorithm 2 only if T_i is the owner of X. But
if T_i is the owner of X, then X ∈ Dset(T_i).

• Suppose that T_i and T_j contend on base object status[m] belonging to some transaction T_m
in E. Firstly, observe that T_i or T_j access status[m] only if there exist t-objects X and Y
in Dset(T_i) and Dset(T_j) respectively such that {X, Y} ⊆ Dset(T_m). This is because T_i and
T_j would both read status[m] in Lines 4 (during t-read) and 39 (during t-write) only if T_m
was the previous owner of X and Y. Secondly, one of T_i or T_j applies a nontrivial primitive
to status[m] only if T_i and T_j read status[m] = live in Lines 4 (during t-read) and 37 (during
t-write). Thus, at least one of T_i or T_j is concurrent to T_m in E. It follows that there exists
a path between X and Y in G(T_i, T_j, E).

(Complexity) Since no implementation of any of the t-operations contains any iteration statements
(like for and while loops), the proof follows. □